ALIGNMENTS



RESULT 1 
AAB74665 

ID AAB74665 standard; protein; 580 AA. 
XX 

AC AAB74665;
XX 

DT 01-JUN-2001 (first entry) 
XX 

DE Human high affinity choline transporter protein. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis.
XX 

OS Homo sapiens. 
XX 

PN WO200116315-A1. 
XX 



PD 08-MAR-2001.
XX 

PF 18-AUG-2000; 2000WO-JP005545.
XX 

PR 27-AUG-1999; 99JP-00240642.

PR 27-DEC-1999; 99JP-00368991.
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP. 
XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 

DR N-PSDB; AAF81712. 
XX 

PT New rat and human spinal cord high affinity choline transporters, useful 

PT in diagnosis of Alzheimer's disease and screening promoters as drugs for 

PT treating Alzheimer's disease. 
XX 

PS Claim 8; Page 76-78; 90pp; Japanese. 
XX 

CC The present sequence represents a human (Homo sapiens) high affinity 

CC choline transporter protein designated cho-1. The cho-1 protein has 

CC nootropic and neuroprotective activities. The cho-1 polynucleotide and 

CC protein can be used for the diagnosis of diseases related to the 

CC expression of cho-1 by comparing the cho-1 polynucleotide sequence in a 

CC sample to that of a control. Drug compositions containing the cho-1 

CC protein or expression promoters or inhibitors of cho-1 are useful for 

CC treating disorders characterised by abnormal levels of cho-1, such as 

CC Alzheimer's disease 
XX 

SQ Sequence 580 AA; 

Query Match 100.0%; Score 2972; DB 4; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580

        ||||||||||||||||||||||||||||||||||||||||
Db  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 2
AAB86837

ID AAB86837 standard; protein; 580 AA.
XX

AC AAB86837;
XX




DT 2?-NOV-2001 (first entry)
XX




DE Human CHOT protein.
XX

KW CHOT; human; choline transporter; chromosome 2q11-13; nootropic;

KW neuroprotective; gene therapy; antisense therapy; degenerative disease;

KW cognitive disorder; Alzheimer's disease.
XX

OS Homo sapiens.
XX

PN DE10009055-A1.
XX

PD 30-AUG-2001.
XX

PF 28-FEB-2000; 2000DE-01009055.
XX

PR 28-FEB-2000; 2000DE-01009055.
XX

PA (BRUE/) BRUESS M.

PA (BOEN/) BOENISCH H.
XX

PI Bruess M, Boenisch H;
XX

DR WPI; 2001-590709/67.

DR N-PSDB; AAH49207.
XX

PT A new gene encoding human choline transporter, designated hCHOT is

PT located on chromosome 2q11-13 and is useful to treat degenerative

PT disorders such as Alzheimer's disease.
XX

PS Disclosure; Page 11; 12pp; German.
XX

CC This invention describes a novel gene encoding human choline transporter, 

CC designated hCHOT which is located on chromosome 2q11-13. The products of

CC the invention have nootropic and neuroprotective activity and can be used 

CC for gene or antisense therapy. (I) is used to treat degenerative disease, 

CC particularly cognitive disorders such as Alzheimer's disease. Sense and 

CC antisense oligonucleotides derived from the gene may be used in 

CC diagnostics and other techniques. This sequence represents the human CHOT 

CC protein described in the invention 
XX 

SQ Sequence 580 AA; 

Query Match 100.0%; Score 2972; DB 4; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580

        ||||||||||||||||||||||||||||||||||||||||
Db  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 3 
ABU08979 

ID ABU08979 standard; protein; 580 AA. 
XX 

AC ABU08979; 
XX 

DT 13-JUN-2003 (first entry) 
XX 

DE Human high affinity choline transporter, HACT. 
XX 

KW Human; HACT; high affinity choline transporter; pain; 

KW neurotransmitter biosynthesis; learning and memory; aging; epilepsy; 

KW neurological disorder; spasticity; myoclonus; muscle spasm; 

KW muscle hyperactivity; stroke; head trauma; neuronal cell death; 

KW multiple sclerosis; spinal chord injury; dystonia; Alzheimer's disease;

KW Myasthenia Gravis; multi-infarct dementia; AIDS dementia;

KW Parkinson's disease; Huntington's disease; amyotrophic lateral sclerosis; 

KW ALS; attention deficit disorder; organic brain syndrome; schizophrenia; 

KW nicotine addiction; memory disorder; cognitive disorder. 
XX 

OS Homo sapiens. 
XX 

PN US6500643-B1. 
XX 

PD 31-DEC-2002. 
XX 

PF 07-SEP-2000; 2000US-00657252.
XX

PR 07-SEP-2000; 2000US-00657252.
XX 

PA (UYFL-) UNIV FLORIDA.
XX 

PI Wu D, Gu Y, Millard WJ, He Y; 
XX 

DR WPI; 2003-361535/34. 

DR N-PSDB; ABX94338. 
XX 

PT Novel isolated polynucleotide (I) that encodes high affinity choline 

PT transporter protein, useful for preventing, treating or ameliorating 

PT neurological and cognitive disorders such as Alzheimer's or Parkinson's 

PT disease. 
XX 

PS Claim 1; Col 21-24; 20pp; English. 
XX 

CC The invention relates to an isolated polynucleotide which encodes a high 

CC affinity choline transporter (HACT) protein appearing as ABU08979. Also 

CC included are a polynucleotide encoding a fragment consisting of at least 

CC about 50 amino acids of the HACT protein, a vector comprising the 

CC polynucleotide, a composition comprising a vector comprising a 

CC polynucleotide which comprises at least about 12 contiguous nucleic acids 

CC of a polynucleotide appearing as ABX94339 (encoding choline 

CC acetyltransferase), a recombinant host cell which comprises the vector

CC (used to express the HACT protein or fragment). The polynucleotide is

CC useful as a probe or primer to detect the presence of HACT polynucleotide 

CC in a sample, such as a biological sample, or for screening for test 

CC agents which bind to the polynucleotide. A pharmaceutical composition 

CC comprising the polynucleotide is useful for preventing, treating or 



CC ameliorating neurological and cognitive disorders e.g. pain, spasticity, 

CC myoclonus, muscle spasm, muscle hyperactivity, epilepsy, stroke, head 

CC trauma, neuronal cell death, multiple sclerosis, spinal chord injury, 

CC dystonia, Alzheimer's disease, myasthenia gravis, multi-infarct

CC dementia, AIDS dementia, Parkinson's disease, Huntington's disease,

CC amyotrophic lateral sclerosis (ALS), attention deficit disorder, nicotine

CC addiction, organic brain syndromes, schizophrenia or memory and cognitive 

CC disorders. HACT is thought to be the rate limiting step in cholinergic 

CC neurotransmitter biosynthesis and regeneration (cholinergic transmissions 

CC are crucial to brain functions such as learning and memory). The present

CC sequence represents human HACT 
XX 

SQ Sequence 580 AA; 



Query Match 100.0%; Score 2972; DB 6; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580

        ||||||||||||||||||||||||||||||||||||||||
Db  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 4 
ADD50649 

ID ADD50649 standard; protein; 580 AA. 
XX 

AC ADD50649; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE High-affinity choline transporter (CHT) associated protein sequence #3. 
XX 

KW High-affinity choline transporter; CHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease; 

KW schizophrenia; dysautonomia; myasthenia gravis; brain; 

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic. 

XX 

OS Unidentified. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S.

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Disclosure; SEQ ID NO 12; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present protein sequence of unknown function is provided 

CC in the electronic sequence data but is not mentioned in the printed 



CC specification. Note: The sequence data for this patent was obtained in 

CC electronic format directly from the USPTO web site at seqdata.uspto.gov. 
XX 

SQ Sequence 580 AA; 

Query Match 100.0%; Score 2972; DB 7; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580

        ||||||||||||||||||||||||||||||||||||||||
Db  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 5 
ADD50639 

ID ADD50639 standard; protein; 580 AA. 
XX 

AC ADD50639; 
XX 

DT 15-JAN-2004 (first entry) 



XX 

DE Human high-affinity choline transporter (hCHT).
XX 

KW Human; high-affinity choline transporter; hCHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease;

KW schizophrenia; dysautonomia; myasthenia gravis; brain; 

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic. 

XX 

OS Homo sapiens. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S.

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 

DR N-PSDB; ADD50638. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Claim 1; SEQ ID NO 2; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present sequence represents hCHT. Note: The sequence data 

CC for this patent was obtained in electronic format directly from the USPTO 

CC web site at seqdata.uspto.gov. 

XX 

SQ Sequence 580 AA; 

Query Match 100.0%; Score 2972; DB 7; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580

        ||||||||||||||||||||||||||||||||||||||||
Db  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 6
ADD50648

ID ADD50648 standard; protein; 580 AA.
XX

AC ADD50648;
XX

DT 15-JAN-2004 (first entry)
XX

DE High-affinity choline transporter (CHT) associated protein sequence #2.
XX

KW High-affinity choline transporter; CHT; cholinergic function;

KW Parkinson's disease; Huntington's disease; Alzheimer's disease;

KW schizophrenia; dysautonomia; myasthenia gravis; brain;

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic;

KW neuroprotective; neuroleptic.
XX





OS Unidentified. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S.

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Disclosure; SEQ ID NO 11; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present protein sequence of unknown function is provided 

CC in the electronic sequence data but is not mentioned in the printed 

CC specification. Note: The sequence data for this patent was obtained in 

CC electronic format directly from the USPTO web site at seqdata.uspto.gov. 
XX 

SQ Sequence 580 AA; 

Query Match 100.0%; Score 2972; DB 7; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580

        ||||||||||||||||||||||||||||||||||||||||
Db  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 7
ADD50647

ID ADD50647 standard; protein; 580 AA.
XX

AC ADD50647;
XX

DT 15-JAN-2004 (first entry)
XX

DE High-affinity choline transporter (CHT) associated protein sequence #1.
XX

KW High-affinity choline transporter; CHT; cholinergic function;

KW Parkinson's disease; Huntington's disease; Alzheimer's disease;

KW schizophrenia; dysautonomia; myasthenia gravis; brain;

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic;

KW neuroprotective; neuroleptic.
XX

OS Unidentified.
XX

PN US2003114399-A1.
XX

PD 19-JUN-2003.
XX

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.



XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S.

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Disclosure; SEQ ID NO 10; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present protein sequence of unknown function is provided 

CC in the electronic sequence data but is not mentioned in the printed 

CC specification. Note: The sequence data for this patent was obtained in

CC electronic format directly from the USPTO web site at seqdata.uspto.gov. 

XX 

SQ Sequence 580 AA; 



Query Match 100.0%; Score 2972; DB 7; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||||||||||||||||||||||||||||||||||||||||
Db 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 8 
AAB74664 

ID AAB74664 standard; protein; 580 AA. 
XX 

AC AAB74664; 
XX 

DT 01-JUN-2001 (first entry) 
XX 

DE Rat high affinity choline transporter protein. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis.
XX 

OS Rattus norvegicus. 
XX 

PN WO200116315-A1. 
XX 

PD 08-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO-JP005545.
XX

PR 27-AUG-1999; 99JP-00240642.

PR 27-DEC-1999; 99JP-00368991.
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP. 
XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 

DR N-PSDB; AAF81711. 
XX 

PT New rat and human spinal cord high affinity choline transporters, useful 

PT in diagnosis of Alzheimer's disease and screening promoters as drugs for 

PT treating Alzheimer's disease.
XX 



PS Claim 5; Page 69-71; 90pp; Japanese. 
XX 

CC The present sequence represents a rat (Rattus norvegicus) high affinity 

CC choline transporter protein designated cho-1. The cho-1 protein has 

CC nootropic and neuroprotective activities. The cho-1 polynucleotide and 

CC protein can be used for the diagnosis of diseases related to the 

CC expression of cho-1 by comparing the cho-1 polynucleotide sequence in a 

CC sample to that of a control. Drug compositions containing the cho-1 

CC protein or expression promoters or inhibitors of cho-1 are useful for 

CC treating disorders characterised by abnormal levels of cho-1, such as 

CC Alzheimer's disease
XX 

SQ Sequence 580 AA; 



Query Match 94.9%; Score 2820; DB 4; Length 580; 

Best Local Similarity 93.1%; Pred. No. 7.6e-275; 

Matches 540; Conservative 24; Mismatches 16; Indels 0; Gaps 0 

Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       | ||||||:|||:||||| ||||||||:|||||:||||||||||||||||||||||||||
Db   1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       |||||||||||||||| || ||||||||||||||||||||||||||||||||||||||||
Db  61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGFPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||:||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||:||||||||||||: ||||||||||||||||||||||||||| ||||||:||||
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYPDKNGIYNQRFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||:||||| ||||:||||||||||||||||||:|||||:|||||||||||||:||||||:
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDIFDAVVSRHSEENMDKTILVRNENIKLN 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||| ||||||:||||||||||| |||||||||||||||||
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 9 
ADD50643 

ID ADD50643 standard; protein; 580 AA. 
XX 

AC ADD50643;
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE Rat high-affinity choline transporter (rCHT).
XX 

KW Rat; high-affinity choline transporter; rCHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease;

KW schizophrenia; dysautonomia; myasthenia gravis; brain; 

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic. 

XX 

OS Rattus sp. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D.

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 

DR N-PSDB; ADD50642. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Example 1; SEQ ID NO 6; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present sequence represents rat CHT (rCHT). Note: The



CC sequence data for this patent was obtained in electronic format directly 

CC from the USPTO web site at seqdata.uspto.gov. 

XX 

SQ Sequence 580 AA;

Query Match 94.9%; Score 2820; DB 7; Length 580; 

Best Local Similarity 93.1%; Pred. No. 7.6e-275; 

Matches 540; Conservative 24; Mismatches 16; Indels 0; Gaps 0; 



Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       | ||||||:|||:||||| ||||||||:|||||:||||||||||||||||||||||||||
Db   1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       |||||||||||||||| || ||||||||||||||||||||||||||||||||||||||||
Db  61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGFPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||:||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||:||||||||||||: ||||||||||||||||||||||||||| ||||||:||||
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYPDKNGIYNQRFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||:||||| ||||:||||||||||||||||||:|||||:|||||||||||||:||||||:
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDIFDAVVSRHSEENMDKTILVRNENIKLN 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||| ||||||:||||||||||| |||||||||||||||||
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 10 
AAY72388 

ID AAY72388 standard; protein; 580 AA. 
XX 

AC AAY72388; 
XX 

DT 24-APR-2001 (first entry) 



XX 

DE Mouse P4P6B1 OMA (obese mice adipocyte) protein. 
XX 

KW Mouse; OMA protein; obese mice adipocyte; P4P6B1; 

KW fuel metabolism disorder; therapy; obesity; diabetes; gene therapy; 

KW anorectic; antidiabetic. 

XX 

OS Mus sp.
XX 

PN WO200078950-A2. 
XX 

PD 28-DEC-2000. 
XX 

PF 13-JUN-2000; 2000WO-US016217.
XX

PR 22-JUN-1999; 99US-0141515P.
XX 

PA (AMYL-) AMYLIN PHARM INC. 
XX 

PI Sierzega M, Albrandt K; 
XX 

DR WPI; 2001-112322/12. 

DR N-PSDB; AAD02457. 
XX 

PT Novel obese mice adipocyte polypeptides useful in diagnosis and treatment 

PT of disorders of fuel metabolism such as obesity or diabetes. 

XX 

PS Claim 11; Fig 3; 83pp; English. 
XX 

CC The present sequence is mouse OMA (obese mice adipocyte) protein encoded 

CC by P4P6B1 cDNA. The P4P6B1 cDNA fragment was generated by RNA 

CC fingerprinting using random primers P4 and P6. OMA is used as a 

CC diagnostic reagent for diagnosing a disorder of fuel metabolism in an 

CC underweight or an overweight individual, by detecting the transcription 

CC level of a gene encoding OMA, which is induced or repressed in an 

CC individual by a factor such as genetic obesity, fasting and refeeding of 

CC a fasted individual. OMA is useful in the generation of antibodies, for 

CC use in pharmaceutical compositions and for studying DNA/protein 

CC interactions. Nucleic acids encoding OMA are involved in gene therapy. An 

CC inhibitor of OMA or an antisense oligonucleotide that inhibits expression 

CC of OMA are useful for treating disorders of fuel metabolism such as 

CC obesity or diabetes 

XX 

SQ Sequence 580 AA; 

Query Match 94.5%; Score 2810; DB 4; Length 580;

Best Local Similarity 93.1%; Pred. No. 7.8e-274; 

Matches 540; Conservative 23; Mismatches 17; Indels 0; Gaps 0; 

Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       |:||||||:|||:||||||||||||||:|||||: |||||||||||||||||||||||||
Db   1 MSFHVEGLVAIILFYLLILLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       |||||||||||||||| || ||||||||||||||||||||||||||||||||||||||||
Db  61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       ||||||||||||||||||||||||| |||||||||||||||||||||||||||||:||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||:||||||||||||: ||||||||||||||||||||||||| | ||||||:||||
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||:||||| ||||:||||||||||||||||||||||||||||||||||||||:||||||:
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVRNENIKLN 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||| ||||||:||||||||||| |||||||||||||||||
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 11 
AAB74666 

ID AAB74666 standard; protein; 580 AA. 
XX 

AC AAB74666; 
XX 

DT 01-JUN-2001 (first entry) 
XX 

DE Mouse high affinity choline transporter protein. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis. 
XX 

OS Mus musculus.
XX 

PN WO200116315-A1. 
XX 

PD 08-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO-JP005545.
XX

PR 27-AUG-1999; 99JP-00240642.

PR 27-DEC-1999; 99JP-00368991.
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP.



XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 

DR N-PSDB; AAF81713. 

XX 

PT New rat and human spinal cord high affinity choline transporters, useful 

PT in diagnosis of Alzheimer's disease and screening promoters as drugs for 

PT treating Alzheimer's disease. 
XX 

PS Claim 11; Page 82-85; 90pp; Japanese. 
XX 

CC The present sequence represents a mouse (Mus musculus) high affinity 

CC choline transporter protein designated cho-1. The cho-1 protein has 

CC nootropic and neuroprotective activities. The cho-1 polynucleotide and 

CC protein can be used for the diagnosis of diseases related to the 

CC expression of cho-1 by comparing the cho-1 polynucleotide sequence in a 

CC sample to that of a control. Drug compositions containing the cho-1 

CC protein or expression promoters or inhibitors of cho-1 are useful for 

CC treating disorders characterised by abnormal levels of cho-1, such as 

CC Alzheimer's disease 
XX 

SQ Sequence 580 AA;

Query Match 94.2%; Score 2801; DB 4; Length 580; 

Best Local Similarity 92.8%; Pred. No. 6.3e-273; 

Matches 538; Conservative 23; Mismatches 19; Indels 0; Gaps 0; 

Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       |:||||||:|||:||||| ||||||||:|||||: || ||||||||||||||||||||||
Db   1 MSFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEEHSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       |||||||||||||||| || ||||||||||||||||||||||||||||||||||||||||
Db  61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       ||||||||||||||||||||||||| |||||||||||||||||||||||||||||:||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||:||||||||||||: ||||||||||||||||||||||||| | ||||||:||||
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||:||||| ||||:||||||||||||||||||||||||||||||||||||||:||||||:
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVRNENIKLN 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||| ||||||:||||||||||| |||||||||||||||||
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 12 
ADD50641 

ID ADD50641 standard; protein; 580 AA. 
XX 

AC ADD50641; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE Mouse high-affinity choline transporter (mCHT) #1. 
XX 

KW Mouse; high-affinity choline transporter; mCHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease;

KW schizophrenia; dysautonomia; myasthenia gravis; brain;

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic. 

XX 

OS Mus sp. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 

DR N-PSDB; ADD50640. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Claim 29; SEQ ID NO 4; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease,

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present sequence represents mCHT. Note: The sequence data

CC for this patent was obtained in electronic format directly from the USPTO 

CC web site at seqdata.uspto.gov. 

XX 

SQ Sequence 580 AA;

Query Match 94.0%; Score 2795; DB 7; Length 580; 

Best Local Similarity 92.6%; Pred. No. 2.5e-272; 

Matches 537; Conservative 23; Mismatches 20; Indels 0; Gaps 0;

Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       | ||||||:|||:||||| ||||||||:|||||: |||||||||||||||||||||||||
Db   1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       |||||||||||||||| || ||||| ||||||||||||||||||||||||||||||||:|
Db  61 TWVGGGYINGTAEAVYGPGCGLAWAHAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFKQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       ||||||||||||||||||||||||| |||||||||||||||||||||||||||||:||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||:||||||||||||: ||||||||||||||||||||||||| | ||||||:||||
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||:||||| ||||:||||||||||||||||||||||||||||||||||||||:||||||:
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVRNENIKLN 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||| ||||||:||||||||||| |||||||||||||||||
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 13 
ADD50661 

ID ADD50661 standard; protein; 580 AA. 
XX 

AC ADD50661; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE Mouse high-affinity choline transporter (mCHT) #2. 
XX 

KW Mouse; high-affinity choline transporter; mCHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease;

KW schizophrenia; dysautonomia; myasthenia gravis; brain;

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic. 

XX 

OS Mus sp. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 

DR N-PSDB; ADD50660. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Disclosure; SEQ ID NO 24; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 



CC signalling. The present sequence represents mCHT. Note: The sequence data 

CC for this patent was obtained in electronic format directly from the USPTO 

CC web site at seqdata.uspto.gov. 
XX 

SQ Sequence 580 AA; 

Query Match 94.0%; Score 2795; DB 7; Length 580; 

Best Local Similarity 92.6%; Pred. No. 2.5e-272; 

Matches 537; Conservative 23; Mismatches 20; Indels 0; Gaps 0 



Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       | ||||||:|||:||||| ||||||||:|||||: |||||||||||||||||||||||||
Db   1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       |||||||||||||||| || ||||| ||||||||||||||||||||||||||||||||:|
Db  61 TWVGGGYINGTAEAVYGPGCGLAWAHAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFKQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       ||||||||||||||||||||||||| |||||||||||||||||||||||||||||:||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||:||||||||||||: ||||||||||||||||||||||||| | ||||||:||||
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||:||||| ||||:||||||||||||||||||||||||||||||||||||||:||||||:
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVRNENIKLN 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||| ||||||:||||||||||| |||||||||||||||||
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580





RESULT 14 
AAB74663 

ID AAB74663 standard; protein; 576 AA. 
XX 

AC AAB74663;
XX 



DT 01-JUN-2001 (first entry) 
XX 

DE C. elegans high affinity choline transporter protein. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis. 
XX 

OS Caenorhabditis elegans. 
XX 

PN WO200116315-A1. 
XX 

PD 08-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO-JP005545.
XX

PR 27-AUG-1999; 99JP-00240642.

PR 27-DEC-1999; 99JP-00368991.
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP. 
XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 

DR N-PSDB; AAF81710. 
XX 

PT New rat and human spinal cord high affinity choline transporters, useful 

PT in diagnosis of Alzheimer's disease and screening promoters as drugs for

PT treating Alzheimer's disease.
XX 

PS Claim 2; Page 62-64; 90pp; Japanese. 
XX 

CC The present sequence represents a Caenorhabditis elegans high affinity 

CC choline transporter protein designated cho-1. The cho-1 protein has 

CC nootropic and neuroprotective activities. The cho-1 polynucleotide and 

CC protein can be used for the diagnosis of diseases related to the 

CC expression of cho-1 by comparing the cho-1 polynucleotide sequence in a 

CC sample to that of a control. Drug compositions containing the cho-1 

CC protein or expression promoters or inhibitors of cho-1 are useful for 

CC treating disorders characterised by abnormal levels of cho-1, such as 

CC Alzheimer's disease 
XX 

SQ Sequence 576 AA; 

Query Match 48.9%; Score 1453; DB 4; Length 576; 

Best Local Similarity 50.5%; Pred. No. 4.7e-137; 

Matches 295; Conservative 95; Mismatches 150; Indels 44; Gaps 9 

Qy 7 GLIAIIVFYLLILLVGIWAAWRTKNSGSAEER S EAI I VGGRDI GLLVGGFTMTATW 62 

l-ll: Ihllhlllll ::|:| I : I ::: I I : I I I I I | | | I I I I 

Db 6 GIVAIVFFYVLILWGIWAGRKSKSSKELESEAGAATEEVMLAGRNIGTLVGIFTMTATW 65 

Qy 63 VGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIY 122

M I M I I I I I t : I II I I I : I I :: I I :: I I I I I I I I : I I : I I I I I I I I 

Db 66 VGGAYINGTAEALY--NGGLLGCQAPVGYAISLVMGGLLFAKKMREEGYITMLDPFQHKY 123

Qy 123 GKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGGLY 182

I : I : I I I : : : I I I : I I II III I I I I I I : I 1 I : : I I : II : | | II | | II | 
Db 124 GQRIGGLMYVPALLGETFWTAAILSALGATLSVILGIDMNASVTLSACIAVFYTFTGGYY 183 



Qy 183 SVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDS-SEVYSWL 241

: I I I I I I I I I I I I I I I I I : I I I : I II I I : I : | : 

Db 184 AVAYT D WQL FC I FVGLWVC VPAAMVH DGAKD I S RNA GDWIGEIGGFKETSLWI 237 

Qy 242 DSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDWN 301

I I I I : I I I I II I I I | | I I | : I I I II I : I I I :: I I I I I I I I I : I I I 
Db 238 DCMLLLVFGGIPWQVYFQRVLSSKTAHGAQTLSFVAGVGCILMAIPPALIGAIARNTDWR 297

Qy 302 QTAYGLPDPKTTEEA DM I LP I VLQ YLCPVYI S FFGLGAVSAAVMS S ADS S I LS A 355 

M • I h : I :: I : I I I I I ::: I I I I | I | | | | | II I I I I : I I I 

Db 298 MTDYSPWNNGTKVESIPPDKRNMWPLVFQYLTPRWVAFIGLGAVSAAVMSSADSSVLSA 357

Qy 356 SSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYI 415

: I I I I I I :: I : I : I I : I I :: I I I I : I I I II I : : : I I I I I I : I | I I : 
Db 358 ASMFAHNIWKLTIRPHASEKEVIIVMRIAIICVGIMATIMALTIQSIYGLWYLCADLVYV 417

Qy 416 VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQ 475 

: : I I I I II I : : : : I I I I : : I I I I ! I I : I I I I : I | | | : | : | 

Db 418 ILFPQLLCWYMPRSNTYGSLAGYAVGLVLRLIGGEPLVSLPAFFHYPMY TDGV--Q 472 

Qy 476 KFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAW ARHSEENMDKTILV 532 

I I I : I I I : : I I : I : : I I : I I I I : I I II I I : | 

Db 473 YFPFRTTAMLSSMATIYIVSIQSEKLFKSGRLSPEWDVMGCWNIPIDHVPLPSDVSFAV 532 

Qy 533 KNENIKL DELALVKPRQSMTLSSTFTN 559 

: I : : I I I : I : I I : I 

Db 533 SSETLNMKAPNGTPAPVHPNQQPSDENTLLHPYSDQSYYSTNSN 576 



RESULT 15 
ADD50645 



ID ADD50645 standard; protein; 576 AA. 
XX 

AC ADD50645; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE C. elegans CHOI protein. 
XX 

KW High-affinity choline transporter; CHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease;

KW schizophrenia; dysautonomia; myasthenia gravis; brain;

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic; CHOI. 

XX 

OS Caenorhabditis elegans. 
XX

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 



XX 

PF 23-JUL-2001; 2001US-00911077 . 
XX 

PR 23-JUL-2001; 2001US-00911077 . 
XX 

PA (BLAK/) BLAKELY R D. 



PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Disclosure; SEQ ID NO 8; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic

CC function in the cell that is in a patient having Parkinson's disease,

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present sequence represents Caenorhabditis elegans CHOI 

CC protein. Note: The sequence data for this patent was obtained in 

CC electronic format directly from the USPTO web site at seqdata.uspto.gov. 

XX 

SQ Sequence 576 AA;



Query Match 48.9%; Score 1453; DB 7; Length 576; 

Best Local Similarity 50.5%; Pred. No. 4.7e-137; 

Matches 295; Conservative 95; Mismatches 150; Indels 44; Gaps 9 

Qy 7 GLIAIIVFYLLILLVGIWAAWRTKNSGSAEER SEAIIVGGRDIGLLVGGFTMTATW 62 

I : : I I : I I : I I I : I I I I I :: I : I I : I ::: I I : I I I I I 

Db 6 G I VAI VF F YVL I LWG I WAGRK SKSSKELES EAGAAT E E VMLAGRN I GT L VG I FTMT AT W 65 

Qy 63 VGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIY 122 

I ! I I I I I I I I I : I II I I I : I I I I :: I I I I I I I I : I I : I I I I I I I I 

Db 66 VGGAYINGTAEALY--NGGLLGCQAPVGYAISLVMGGLLFAKKMREEGYITMLDPFQHKY 123 

Qy 123 GKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGGLY 182

I : I : I I i : : : I I I : I I II III I I I I I I : I I I : : I I : II : | | | | | | | | | 
Db 124 GQRIGGLMYVPALLGETFWTAAILSALGATLSVILGIDMNASVTLSACIAVFYTFTGGYY 183

Qy 183 SVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDS-SEVYSWL 241

: I I I I I I I I I I I I I I I I I : I I I : I II I I : I : | | : 

Db 184 AVAYTDWQLFCIFVGLWVCVPAAMVHDGAKDISRNA GDWIGEIGGFKETSLWI 237 

Qy 242 DSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDWN 301

I I I I : I I I I 11111:1 I I I I I : I I I :: I I I I Mill : I I I 

Db 238 DCMLLLVFGGIPWQVYFQRVLSSKTAHGAQTLSFVAGVGCILMAIPPALIGAIARNTDWR 297



Qy 



302 QTAYGLPDPKTTEEA DMI LP I VLQYLC PVYI S FFGLGAVSAAVMS S ADS S I LS A 355 



i i • i i - . i . . i . i i i i i . . . , i i i i i i i m i i i i i i i : i i i 

Db 298 MTDYSPWNNGTKVESIPPDKRNMWPLVFQYLTPRWVAFIGLGAVSAAVMSSADSSVLSA 357 

Qy 356 SSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYI 415 

= 1111 M::|: I : I I : I I : : I I I I : I Mill ::: :||||: 

Db 358 ASMFAHNIWKLTIRPHASEKEVIIVMRIAIICVGIMATIMALTIQSIYGLWYLCADLVYV 417

QY 416 VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQ 475 

: : I I I I I I I : : : : I I I I : : I I I I I I I : I I M : I III : I : I 

Db 418 ILFPQLLCWYMPRSNTYGSLAGYAVGLVLRLIGGEPLVSLPAFFHYPMY TDGV — Q 472 

Qy 476 KFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAW ARH S EENMDKT I LV 532 

I I I : I I I : : I I : I : : I I : I I I I : I I II I I : | 

Db 473 YFPFRTTAMLSSMATIYIVSIQSEKLFKSGRLSPEWDVMGCWNIPIDHVPLPSDVSFAV 532 

Qy 533 KNENIKL DELALVKPRQSMTLSSTFTN 559 

: I : : II h I : I I : I 

Db 533 SSETLNMKAPNGTPAPVHPNQQPSDENTLLHPYSDQSYYSTNSN 576



Search completed: September 28, 2004, 17:05:47 
Job time : 131 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:          September 28, 2004, 17:03:49 ; Search time 34 Seconds
                 (without alignments)
                 880.678 Million cell updates/sec

Title:           US-10-069-541-6
Perfect score:   2972
Sequence:        1 MAFHVEGLIAIIVFYLLILL EAFLDVDSSPEGSGTEDNLQ 580

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        389414 seqs, 51625971 residues

Total number of hits satisfying chosen parameters:     389414

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
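
Note on the throughput figure: the "Million cell updates/sec" value in the run header is consistent with the query length (580 residues) multiplied by the number of database residues searched (51625971), divided by the search time (34 seconds). The report itself does not state this formula, so the Python sketch below is an assumption checked only against this run's numbers; the function name is hypothetical.

```python
def cell_updates_per_sec_millions(query_len: int, db_residues: int, seconds: float) -> float:
    """Assumed formula: one dynamic-programming cell per (query residue, database residue) pair."""
    return query_len * db_residues / seconds / 1e6


# Values taken from the run header of this report.
rate = cell_updates_per_sec_millions(580, 51_625_971, 34)
print(f"{rate:.3f} Million cell updates/sec")
```

Because the search time is printed in whole seconds, close agreement with the reported figure should be read as a consistency check rather than a definition.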

Database : Issued_Patents_AA:*

1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep:*

2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep:*

3: /cgn2_6/ptodata/2/iaa/6A_COMB.pep:*

4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep:*

5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:*

6: /cgn2_6/ptodata/2/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
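
Note on Query Match: the report defines Pred. No. but not the Query Match percentage. For the results listed in this run, the percentage is consistent with the hit's Score divided by the Perfect score (2972), rounded to one decimal place. The Python sketch below illustrates that reading; it is an inference from the printed numbers, not a documented GenCore formula, and the function name is hypothetical.

```python
PERFECT_SCORE = 2972  # "Perfect score" from the run header (query aligned to itself)


def query_match_percent(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Assumed definition: hit score as a percentage of the perfect score."""
    return round(100.0 * score / perfect, 1)


# Scores as printed for several hits in this report.
for score in (2972, 1453, 308.5, 301, 298.5, 122.5):
    print(f"Score {score}: Query Match {query_match_percent(score)}%")
```

Under this assumption, a hit with Score 1453 corresponds to the 48.9% printed for the C. elegans entries, and Score 308.5 to the 10.4% printed for the first issued-patent hit after the identical match.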

SUMMARIES 
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4. 


4 


499 


4 


us- 
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991A-27018 
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27018, A 
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039A-8830 
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40 
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134- 
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Sequence 
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252- 


991A-27682 
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27682, A 


42 
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4. 


1 
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09- 
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001C-3001 


Sequence 


3001, Ap 


43    122.5    4.1    526    4  US-09-134-000C-4715
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44 
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4 
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4 


US-09-198-452A-1105  Sequence 1105, Ap



ALIGNMENTS 



RESULT 1 
US-09-657-252-2 

Sequence 2, Application US/09657252 
Patent No. 6500643 
GENERAL INFORMATION: 
APPLICANT: Wu, Dong-Hai 
APPLICANT: Gu, Yunrong 
APPLICANT: Millard, William 
APPLICANT: He, Yun-Je 

TITLE OF INVENTION: Human High Affinity Choline Transporter cDNA 
FILE REFERENCE: MBHB00-639 
CURRENT APPLICATION NUMBER: US/09/657,252
CURRENT FILING DATE: 2000-09-07
NUMBER OF SEQ ID NOS: 6
SOFTWARE: PatentIn version 3.0
SEQ ID NO 2 
LENGTH: 580 
TYPE: PRT 



; ORGANISM: Homo sapiens 
US-09-657-252-2 



Query Match 100.0%; Score 2972; DB 4; Length 580; 

Best Local Similarity 100.0%; Pred. No. 1.6e-281; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
              ||||||||||||||||||||||||||||||||||||||||
Db        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580




RESULT 2 








US-07-841-651-4

; Sequence 4, Application US/07841651





Patent No. 5410031 
GENERAL INFORMATION: 

APPLICANT: Pajor, Ana M 
APPLICANT: Wright, Ernest M 

TITLE OF INVENTION: Cloning and Functional Expression of a 



; TITLE OF INVENTION: Mammalian Na+/Nucleoside Cotransporter : A Member of 
the 

; TITLE OF INVENTION: SGLT Family 
NUMBER OF SEQUENCES: 4 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Sheldon & Mak 
; STREET: 225 South Lake Avenue, Ninth Floor 

; CITY: Pasadena 

; STATE: California 

COUNTRY: USA 
ZIP: 91101 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 

; SOFTWARE: PatentIn Release #1.0, Version #1.25

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/841,651 
FILING DATE: 19920224 
CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 
; NAME: Mandel, SaraLynn 

REGISTRATION NUMBER: 31,853 
REFERENCE/DOCKET NUMBER: 8772 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (818) 796-4000 
TELEFAX: (818) 795-6321 
INFORMATION FOR SEQ ID NO: 4: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 662 amino acids 
TYPE: AMINO ACID 
; TOPOLOGY: linear 

; MOLECULE TYPE: protein 

HYPOTHETICAL: NO 
; ORIGINAL SOURCE: 

; ORGANISM: Oryctolagus cuniculus

US-07-841-651-4 



Query Match 10.4%; Score 308.5; DB 1; Length 662; 

Best Local Similarity 23.4%; Pred. No. 3.2e-21; 

Matches 154; Conservative 110; Mismatches 238; Indels 155; Gaps 26 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

I : : : : I : : : II : I I : I I I : : I I : I : : I : : I I : I 

Db 32 IVIYFLWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86 



Qy 71 TAEAVYVPGYGLAWAQAPIGYS LSLILGGLFFAKPMRSKGYVTMLDPFQQI Y-GK 124 

I III ||: : ::|| :| : I : I I I I : I : : I I 

Db 87 LA GTGAASGIATGGFEWNALIMVWLGWVFVPI YIRA-GVVTMPEYLQKRFGGK 139 



Qy 125 RMGGLLFI PALMGEMFW — AAAI FSALGAT- 1 SVI I DVDMHI SVI I SALIATLYTLVGGL 181 

I : I I : I : : I : I I I I II I : : : | : : : : : I | : I I I I : I I I 
Db 140 RIQIYLSILSLLLYIFTKISADIFS— GAIFIQLTLGLDIYVAIIILLVITGLYTITGGL 197 



Qy 182 Y S VAYT DWQL FC I FVGLW I S VP FAL SH PAVAD I G FTAVHAK Y ~ Q 225 

: I I II : I : II I II I : I II I 

Db 198 AAVIYTDTLQTAIMMVGSVILTGFAFHEVG GYEAFTEKYMRAI PSQI SYGNTSI PQ 253 



Qy 226 KPWLGTVDSSEVYSWLDSFLLLMLGGIPW QAY FQRVL S S S S A 267 

I : I : : I : I I I I I INI:: 

Db 254 KCYTPREDAFHI FRDAITGDI PWPGLVFGMSILTLWYWCTDQVIVQRCLSAKNL 307 

Qy 268 T YAQ VL S FLAAFGC LVMAI P AI L I G AI GAS T DWNQT AY GL P D P KTTEEADMILP 321 

: : : I : : : : : i | : : i | : | • ; | 

Db 308 SHVKAGCILCGYLKVMPMFLIVMMGMVSRILYTDKVACWPSECERYCGTRVGCTNIAFP 367 

Qy 322 I VLQYLCPVYI S FFGLGAVSAAVMS S ADS S I L SAS SMFARNI YQLS FRQNAS DKEI VWVM 381 

: : II : I : I : : I I I I I I I : : I : I I I : II : I I : : 
Db 368 TLVVELMPNGLRGLMLSVMMASLMSSLTSIFNSASTLFTMDIY-TKIRKICAS 426 

Qy 382 RI-TVFVFGASATAMALLTKTVYG — LWYLSSDLVYI — VI FPQLLCVLFVKGTNTYGAV 436 

I : : I : I I : :: I I : I I : I I : I I I I I 

Db 427 RLFMLFLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAIFWKRVNEPGAF 486 

Qy 437 AGYVSGLFLRI TG GEPYLYLQPLIFYPGYYPDDNGI Y 473 

111:1 II I I I I :: I 
Db 487 WGLVLGFLIGISRMITEFAYGTGSCMEPSNCPTIICGVHYLYFAIILF 534 

Qy 474 NQKFPFKTIAMVTSFLTNICISYL^YLFESGTLPPKLDVFDAWA-RHSEENMDKTILV 532 

I I : I : : I I : I : : : : I : I : I 
Db 535 VISIITWWSLFTKPI PDVHLYRLCWSLRNSKE 568 

Qy 533 KNENIKLD — ELALVKPRQSMTLSSTFTNKEAF LDVDSSPEGSGTED 577 

I I I I I : : : I : I : I I I I I : : I : 

Db 569 — ERIDLDAGEEDIQEAPEEATDTEVPKKKKGFFRRAYDLFCGLDQDKGPKMTKEEE 623 



RESULT 3 

US-09-252-991A-24099

; Sequence 24099, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO 
PSEUDOMONAS 

; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: 107196.136 

; CURRENT APPLICATION NUMBER: US/09/252,991A

; CURRENT FILING DATE: 1999-02-18 

; PRIOR APPLICATION NUMBER: US 60/074,788 

; PRIOR FILING DATE: 1998-02-18 

; PRIOR APPLICATION NUMBER: US 60/094,190 

; PRIOR FILING DATE: 1998-07-27 

; NUMBER OF SEQ ID NOS : 33142 

; SEQ ID NO 24099 

LENGTH: 494 

TYPE: PRT 

; ORGANISM: Pseudomonas aeruginosa 

FEATURE: 
; NAME/KEY: UNSURE
; LOCATION: (232) 

OTHER INFORMATION: Identity of amino acid at the above locations are 
unknown.

US-09-252-991A-24099 



Query Match 10.1%; Score 301; DB 4; Length 494; 

Best Local Similarity 24.9%; Pred. No. 1.1e-20;

Matches 116; Conservative 82; Mismatches 207; Indels 60; Gaps 15 

Qy 9 IAIIVFYLLILLVGI WAAWRTKNSGSAEERS EAI I VGGRDI GLLVGGF TMTAT 61 

: I : : I : I I I : I I I : I : : I I I : : I II I I I I 

Db 32 MAL D I FWL I YAAGM I AL GW YGMR RAKTRDD-YLVAGRNLG PGFYLGTMAAT 82 

Qy 62 WGGGYINGTAEAVTVPGYGIAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQI 121 

: I I II 1 I I I III:: Mill: I : : : 

Db 83 VLGGASTIGTVRLGYVHGISGFWLCGAIG — LGIVGLSLFLAKPLLKLKI YTVTQVLERR 140 

Qy 122 YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI ATLYTLVGGL 181 

I : I : : I I : I : I : : : I : : I : I I : : I I : 

Db 141 YN P AARHAS AL I MLVYALMI GAT S T I AI GT VMQVL FGL P FWVS I L I GGGVWL YS T I GGM 200 

Qy 182 YS VAYT D WQL FC I FVGL- W I S VP FAL S H P AVAD IGFTAVHAKYQKPWLG 230 

: I : I I : I I : I I I :: : I : : : I III: I 
Db 201 WSLTLTDIVQFLIMTVGLVFLLMPLSINDAGXWDALVAKLPASYFDFTAI GW — 252 

Qy 231 TVDS S EVYSWLDS FLLLMLGGI PWQAYFQRVLS S S SAT YAQVLS FLAAFGCLVMAI PAI L 290 

I : I II: I I : I I I : : I I I : I I I : : : I 

Db 253 — DTIVTY FLI YFFGIFIGQDIWQRVFTARSETVAKVAGSAAGI YCVLYGMAGAL 305 

Qy 291 I GAI GAS T DWNQT AYGL PD P KTT E EADMI L P I VLQ YLC P VY I S FFGLGAVS AAVMS SAD S 350 

II III I : | : : : | | : | | | : | | : | : 

Db 306 IGMAAKVL LPD LENVNNAFAS WEHS LPNGI RGLVI AAALAALMS TASA 354 

Qy 351 SILSASSMFARNIY-QLSFRQNASDKEIvl^4RITVFVFGASATAMALLTKTWGLWYLS 4 09 

: I : I I : : : : : I : I I I II : I : I I : I : : 

Db 355 GLIAASTTVTQDLLPRLRRGRGQSDNGDVHENRIATLLLGLWLGIALWSDVISALTVA 414 

Qy 410 SDLVYI VI FPQLLCVLFVKGTNTYGAVA GYVSGLFLRITGG 450 

: I : : I : : : I III: I : : : I II 

Db 415 YNLLVGGMLIPLIGAI YWKRATTAGAITSMTLGFLTVLVFMIKDG 459 



RESULT 4 

US-10-162-012-27 

; Sequence 27, Application US/10162012 

; Patent No. 6682597 

; GENERAL INFORMATION: 

; APPLICANT: Curtis, Rory A.J. 

; APPLICANT: Silos-Santiago, Inmaculada 

; APPLICANT: Gu, Wei 

; TITLE OF INVENTION: NOVEL HUMAN ION CHANNEL AND TRANSPORTER FAMILY MEMBERS 

; FILE REFERENCE: 10448-190001 

; CURRENT APPLICATION NUMBER: US/10/162,012

; CURRENT FILING DATE: 2002-06-04 

; PRIOR APPLICATION NUMBER: US 60/209,845 

; PRIOR FILING DATE: 2000-06-06 

; PRIOR APPLICATION NUMBER: US 09/875,321 

; PRIOR FILING DATE: 2001-06-06 

; PRIOR APPLICATION NUMBER: PCT/US01/18340 

; PRIOR FILING DATE: 2001-06-06 

; PRIOR APPLICATION NUMBER: US 60/209,257 



; PRIOR FILING DATE: 2000-06-05 

; PRIOR APPLICATION NUMBER: US 09/875,423 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: PCT/US01/18398

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/209,238 

PRIOR FILING DATE: 2000-06-05 
; PRIOR APPLICATION NUMBER: US 09/875,363 
; PRIOR FILING DATE: 2001-06-05 
; PRIOR APPLICATION NUMBER: PCT/US01/18247
; PRIOR FILING DATE: 2001-06-05 
; PRIOR APPLICATION NUMBER: US 60/227,068 
; PRIOR FILING DATE: 2000-08-22 
; PRIOR APPLICATION NUMBER: US 09/928,530 
; PRIOR FILING DATE: 2001-08-13 

PRIOR APPLICATION NUMBER: PCT/US01/25475 
; PRIOR FILING DATE: 2001-08-15 
; PRIOR APPLICATION NUMBER: US 60/226,770 
; PRIOR FILING DATE: 2000-08-21 
; PRIOR APPLICATION NUMBER: US 09/934,421 
; PRIOR FILING DATE: 2001-08-21 
; PRIOR APPLICATION NUMBER: PCT/US01/26096 
; PRIOR FILING DATE: 2001-08-21 
; PRIOR APPLICATION NUMBER: US 60/279,281 
; PRIOR FILING DATE: 2001-03-28 
; PRIOR APPLICATION NUMBER: US 10/109,029 
; PRIOR FILING DATE: 2002-03-28 
; PRIOR APPLICATION NUMBER: PCT/US02/09728 
; PRIOR FILING DATE: 2002-03-28 
; PRIOR APPLICATION NUMBER: US 60/290,288 
; PRIOR FILING DATE: 2001-05-11 

PRIOR APPLICATION NUMBER: US (not assigned) 
; PRIOR FILING DATE: 2002-05-13 
; NUMBER OF SEQ ID NOS: 48

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 27 

LENGTH: 675 
; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-10-162-012-27 

Query Match 10.0%; Score 298.5; DB 4; Length 675;

Best Local Similarity 22.7%; Pred. No. 3.2e-20; 

Matches 149; Conservative 115; Mismatches 238; Indels 155; Gaps 29 

Qy 2 AFHVEGL IAIIVFY-LLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGF 56 

II : I I I I : : I I I : I I I : I : : I I : : I : I I 

Db 18 AFPQKGLEPGDIAVLVLYFLFVLAVGLWSTVKTKR DTVKGYFLAEGNMVWWPVGA- 72 

Qy 57 TMTATWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLS LILGGLFFAKPMRSKGY 111 

: : I : I I I : I I III : I h I : I : I I : I 

Db 73 SLFASNVGSGHFIGLA GSGAATGI SVSAYELNGLFSVLMLAWI FL- - PI YI AGQ 124 

Qy 112 VTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATI SVIID VDMHIS 164 

||::: I I I I : I I : : : : I I : : : : : I : | : : : : 

Db 125 VTTMPEYLR KRFGGI R-I PI I LAVLYLFI YI FTKI SVDMYAGAI FIQQSSHLDLYLA 180 



Qy 165 VI I SALIATLYTLVGGLYSVAYTDVVQLFCI FVGLWI SVPFALSHPAVADIGFTAVHAKY 224 

: : | : I I : I I I : I I I I : I : : I ' : I I I ' : I I 

Db 181 I VGLLAI TAVYTVAGGLAAVI YT DALQTL IMLI GALTLMG Y — S FAAVG- - GMEGLKEKY 236 



Qy 

Db 



225 QKPWLGTVDSSEVYS-WLDSFLLLMLGGI 252 

Ml: : I I 

237 FLALASNRSENSSCGLPREDAFHIFRDPLTSDLPWPGVLFGMSIPSLWY 285 



Qy 253 P W Q AY FQ RVL S S S SAT YAQVL S FLAAFGC LVMAI PAI L I GAI GAS T DWN QT AYGL P D 309 

| | | | | : : : : : I : : : I I : : : : I = HI I 

Db 28 6 -WCTDQVIVQRTIAAKNLSH^ 342 

Qy 31 o PKTTEE ADMI LPI VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFAR 361 

|: :: :|: I :: II : : : 11 = 11 I I I Ms: I 

Db 343 PEICQKICSNPSGCSDIAYPKLVLELLPTGLRGLMMAVMVAALMSSLTSIFNSASTIFTM 402 

Qy 362 N I YQLS FRQNAS DKEI VWVMRI TVFVFGAS ATAMALLTKTVYGLW Y 4 07 

: : : I I I : I I : : I I : I II IN 1 

Db 403 DLWN-HLRPRASEKELMIVGRVFV LLLVLVS I LWI PWQAS QGGQLFI Y 450 

Q y 408 LSSDLVYI VI FPQLLCVLFVKGTNT YGAVAGYVSGLFLRI TG-GEPYLYLQPLI F 461 

: | | : | : | : I I I I I I I : I M I : : : I : I I 

Db 451 IQS I SS YLQPPVAWF IMGCFWKRTNEKGAFWGLI SGLLLGLVRLVLDFI YVQPRC- 506 

Q y 462 YPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDV 513 

||: : : : : I : I : I I : I : : : I I I : : 

Db 507 DQPDERPVLVKSIHYLYFSMILSTVTLITVSTVSWF TEPPSKEMVSHLTWFT 558 

Qv 514 -FDAWARHSEENMDKTILVKNENIKLD ELALVKPRQSMTLSSTFTNKEA 562 

|||: | ::| : : I I I I |:: 

Db 559 RHDPWQKEQAPPAAPLSLTLSQNGMPEASSSSSVQFEMVQENTSKTHSCDMTPKQS 615 



RESULT 5 

US-09-540-236-2193 

; Sequence 2193, Application US/09540236 
; Patent No. 6673910 
; GENERAL INFORMATION: 

APPLICANT: Gary L. Breton et al.
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO
MORAXELLA CATARRHALIS

; TITLE OF INVENTION: FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: 2709.2005-001 

CURRENT APPLICATION NUMBER: US/09/540,236
; CURRENT FILING DATE: 2000-04-04 
; NUMBER OF SEQ ID NOS : 3840 
; SEQ ID NO 2193 

LENGTH: 521 
; TYPE: PRT 

ORGANISM: M. catarrhalis 
US-09-540-236-2193 

Query Match 10.0%; Score 298; DB 4; Length 521; 

Best Local Similarity 23.4%; Pred. No. 2.4e-20; 

Matches 131; Conservative 103; Mismatches 212; Indels 114; Gaps 



Qy 9 IAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYI 68



I : : I : : : I : : : I I : I : : t I I : : I I I : : I : I : : I : 

Db 33 I S LAVYFI LMI AI GI YAYFKQKND IEGYMLGGRNLSPAVTALSAGAS DMSGWLL 8 6 

Qy 69 NGTAEAVYVPGYGLAWAQAPIGYSLSLILGG LFFAKPMR SKGYVTMLDPFQ 119 

| : I I I I I : i I : I I I : I : I I : I I 

Db 87 LG LPGYMYAS GVVSIWIALGLTIGACANYLIVAPRLRVYTELADNAVTLPDYFS 140 

Qy 120 QI YGKRMGGLLFI PALMGEMF WAAAI F SAL GAT I S VI I DVDMHI SVI I SALI ATLYT 176 

: : | : | : : : I : 1 I II : : : : : : : I : II 

Db 141 NRFHDKSHLLRIMSAWIILFFTVYTAAS LVAGGKLFESSLNLSYSMGLWVTAGVWAYT 200 

Qy 177 L VGGL Y S VAYT D WQL FC I FVGLWI S VP FAL S H PAVAD I GFT AVHAK YQK P WL GT VD S S E 236 

I | | : I : I I I I : : : I I : I : : : I : : I 

Db 201 LFGGFLAVSLTDFVQGVIMLIAMLI VP WAFGEI GGVS EAMAI ATQTNT E 250 

Qy 237 VYSWLDS FLLLMLGGI PWQAY FQRVLSS S SAT YAQVLS FLAAFGCLV 283 

| : : I : : : : : I I I : I : I I h I ' ' 

Db 251 VFNWMNG — WVMGVISLMAWGFGYFGQPHIIVRFMAIRSVKDVPTAMVI GMGWMI 304 

Qy 284 MA-IPAILIGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSA 342 

:: | I ::: I I : : I I I I : I I : II I I I I I : I 

Db 305 LSLIGALMVGLAGIAY-VARTGIELKDPET 1 FLVFSQVLFHPLI SGFLLAAI LA 357 

Qy 343 AVMS SAD S S I L S AS SMFARN I YQ L S FRQNAS DKE I VWVMRI T VFVFGAS ATAMA 396 

I : I I : I : I II I : I I : I : II- I : I I : I : I : I 
Db 358 AIMSTI S SQLLWS S SLTRDI YKLFLDKQASEARQVLI GRI S WLVAI I AIMLAGDSNS S 417 

Qy 397 LLTKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSG LFLRITGG 450 

| : : | I : : I I I M : I I : I : : : I I 

Db 418 VLN LVS HAWAG FG AAFGPLVILSLMWKRMNRNGALAGMIVGALTVIIWVYGG 469 

Qy 451 EPYLYLQPLIFYPGYYPDDNGIYN — QKFPFKTLAMVTSFLTNICISYLAKYLFESGTLP 508 

I I I : : I I : II I : I I : I : I I 

Db 470 FEIGGQPANDAIYSILPGFAF SLVTTIAVSLM TAP 504 

Qy 509 P KL D VFD AWARH S E ENMD K 528 

I : : I : I : I 

Db 505 PPVYIVQKF EDMEK 518 



RESULT 6 
US-07-841-651-2 

; Sequence 2, Application US/07841651 

; Patent No. 5410031 

; GENERAL INFORMATION: 

; APPLICANT: Pajor, Ana M 

; APPLICANT: Wright, Ernest M 

; TITLE OF INVENTION: Cloning and Functional Expression of a 

; TITLE OF INVENTION: Mammalian Na+/Nucleoside Cotransporter : A Member of 

the 

TITLE OF INVENTION: SGLT Family 
NUMBER OF SEQUENCES: 4 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Sheldon & Mak 

STREET: 225 South Lake Avenue, Ninth Floor 
CITY: Pasadena 
; STATE: California 



COUNTRY: USA 
ZIP: 91101 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/841,651 
FILING DATE: 19920224 
CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 
NAME: Mandel, SaraLynn 
REGISTRATION NUMBER: 31,853 
REFERENCE/DOCKET NUMBER: 8772
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (818) 796-4000 
TELEFAX: (818) 795-6321 
INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 672 amino acids 
TYPE: AMINO ACID 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-07-841-651-2 

Query Match 10.0%; Score 298; DB 1; Length 672; 

Best Local Similarity 25.0%; Pred. No. 3.5e-20; 

Matches 153; Conservative 89; Mismatches 232; Indels 138; Gaps 25; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

I I : I :: I I : : I I : I : Mil: : I I : | : : | : : I I : 

Db 26 IAVIAAYFLLVIGVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGASLFASNIGSGH 80 

Qy 68 I NGTAEAVYVP GYGLAWAQAP I G Y S L S LILGGLFFAKPMRSKGYVTMLDPFQQIYG 123 

II I I I II:: : : I I I I : I : I I I 
Db 81 FVGLA GT GAAN GLAVAG FEWN AL FWL L L GW L FAP VYLT AGVI TM PQYLR 130 

Qy 124 KRMGG LLFI PALMGEMFWAAAI F — SALGATI SVI I DVDMHI SVI I SA 169 

I I I I I : I : : : | : | | | | | : I II 

Db 131 KRFGGHRI RLYLSVLS LFLYI FTKI SVDMFSGAVFIQQALGWNI YASVIALL 182 

Qy 170 L I AT L YT LVGGL Y S VAYT DWQ L FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAK Y 224 

I : I I : I I I :: I I I I I I I I : I : I | : : : I I 

Db 183 GI TMVYTVT GGLAALMYT DT VQT FVI I AGAF I LT G YAFHEVG GYSGLFDKYMGAMT 238 

Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAYF 258 

: I : I : I I I I : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSSCYRPRPDSYHLLRDPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298 

Qy 259 QRVL S S S SAT YAQVL S FLAAFG C LVMAI P AI L I GAI GAST DWNQT AYGL P D P KT TE 314 

I I I : : I : : I : I ■:: I I : ' I I : I I 

Db 299 QRCLAGRNLTHIKAGCILCGYLKLTPMFLMVMPGMISRILYPDEVACVAPEVCKRVCGTE 358 

Qy 315 E- -ADMI LP I VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFARN I YQLS FRQNA 372 

: : : I : : I I : I : M : I I I I I : I : : I : I I I I I 
Db 359 VGCSNIAYPRLWKLMPNGLRGLMLAVMLAALMSSLASIFNSSSTLFTMDIYTL — RPRA 416 



Qy 373 SDKEIVWVMRITVFVFGASATAMALLTKTVYG LWYLSSDLVYIV--IFPQLLCVLFV 427 

: I : : I I : I I : I : : I I : I I : = =111 

Db 417 GEGELLLVGRLWWFIVAVSVAWLPWQAAQGGQLFDYIQSVSSYLAPPVSAVFWALFV 476 

Qy 42 8 KGTNTYGAVAGWSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV-- 4 85 

I II I : I I : : I : I I I : I 

Db 4 77 PRVNEKGAFWGLIGGLLMGLARLIP EFSFGTGSCVRP 513 

Qy 486 TSFLTNICISYLAKYLFE-SG TLP-PKLDVFDAWA-RHSEENMDKTI 530 

: I I : I I I I I I III:: I : I I I : I 
Db 514 SACPAFLCRVHYLYFAIVLFFCSGLLIIIVSLCTAPIPRKHLHRLVFSLRHSKE 567 

Qy 531 LVKNENI KLDEL 542 

: I : : III 
Db 568 - - EREDLDADEL 577 



RESULT 7 
US-07-841-651-3 

Sequence 3, Application US/07841651 
Patent No. 5410031 
GENERAL INFORMATION: 

APPLICANT: Pajor, Ana M 
APPLICANT: Wright, Ernest M 

TITLE OF INVENTION: Cloning and Functional Expression of a 
TITLE OF INVENTION: Mammalian Na+/Nucleoside Cotransporter : A Member of 
the 

TITLE OF INVENTION: SGLT Family 
NUMBER OF SEQUENCES: 4 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Sheldon & Mak 

STREET: 225 South Lake Avenue, Ninth Floor 
CITY: Pasadena 
STATE: California 
COUNTRY: USA 
ZIP: 91101 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/841,651
FILING DATE: 19920224 
CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 
NAME: Mandel, SaraLynn 
REGISTRATION NUMBER: 31,853 
REFERENCE/DOCKET NUMBER: 8772
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (818) 796-4000 
TELEFAX: (818) 795-6321 
INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 672 amino acids 
TYPE: AMINO ACID 



TOPOLOGY: linear 

MOLECULE TYPE: protein 

HYPOTHETICAL: NO 

ORIGINAL SOURCE: 
; ORGANISM: Oryctolagus cuniculus 

US-07-841-651-3 



Query Match 10.0%; Score 298; DB 1; Length 672; 

Best Local Similarity 25.0%; Pred. No. 3.5e-20; 

Matches 153; Conservative 89; Mismatches 232; Indels 138; Gaps 25; 

Qy 9 IAI I -VFYLLILLVGIWAAWRTKNSGSAEERSEAI IVGGRDI GLLVGGFTMTATWVGGGY 67 

I I : I : : | | : : | | : | : Mil: : II : | : : I : : | I : 

Db 26 IAVIAAYFLLVIGVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGAS LFASNI GS GH 80 

Qy 68 I N GT AEAVYVP G YGLAWAQ AP I G Y S L S LI LGGLFFAKPMRS KG YVTMLDP FQQI YG 123 

|| Ml II:: : : I I I I : I : I I I 
Db 81 FVGLA GT GAAN G LAVAG F EWNAL FWL L L GW L FAP VY LT AGVI TM PQYLR 130 

Qy 124 KRMGG LLFI PALMGEMFWAAAI F — S ALGAT I SVI I DVDMHI SVI I S A 169 

I I I I I : I : : : I : : IN 

Db 131 KRFGGHRI RLYLSVLSLFLYI FTKI SVDMFS GAVFIQQALGWNI YASVIALL 182 

Qy 170 L I AT L YT LVGGL Y S VAYT D WQL FC I FVGLWI S VP FAL S H P AVAD I GFT AVHAK Y 224 

I : I I : I I I : : I I I I I i I I : I : I | : : : | | 

Db 183 GI TMVYT VT GGLAALMYT DT VQT FVI I AGAF I LT G YAFH EVG GYSGLFDKYMGAMT 238 

Qy 225 QKPWLGTVDS SEVYSWLDS FLLL MLGGIPW QAYF 258 

: I : I : I I I I : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSSCYRPRPDSYHLLRDPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298 

Qy 259 QRVLSSSSATYAQVLSFLAAFGCLVMAI PAILIGAIGASTDWNQTAYGLPDPKT TE 314 

| I I : : I : : I : I : : I I :: I I : I I 

Db 299 QRCLAGRNLTHI KAGCI LCGYLKLTPMFLMVMPGMI SRI LYPDEVACVAPEVCKRVCGTE 358 

Qy 315 E — ADMI LPI VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFARNI YQLS FRQNA 372 

: : : I : : I I : I : I I : I I I I I : I : : I : I I I I I 

Db 359 VGC SN I AYP RLVVKLMPN GLRGLMLAVMLAALMS S LAS I FN S S ST L FTMD I YT L — RPRA 416 

Qy 373 SDKEI VWVMRITVFVFGASATAMALLTKTVYG LWYLSSDLVYIV — IFPQLLCVLFV 427 

: I :: I I : I I : I : : I hi I : : = III 

Db 417 GEGELLLVGRLWWFIVAVSVAWLPWQAAQGGQLFDYIQSVSSYLAPPVSAVFWALFV 476 

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV — 485 

I I I I : I I : : I : | I I : I 

Db 477 PRVNEKGAFWGLI GGLLMGLARLI P EFSFGTGSCVRP 513 

Qy 486 TSFLTNICISYLAKYLFE-SG TLP-PKLDVFDAWA-RHSEENMDKTI 530 

: I I : I I I I I I III:: I : I I I : I 
Db 514 SACPAFLCRVHYLYFAIVLFFCSGLLI I IVSLCTAPI PRKHLHRLVFSLRHSKE 567 



Qy 531 LVKNENI KLDEL 542 

: I : : I I I 
Db 568 — EREDLDADEL 577 



RESULT 8 



US-10-162-012-30 

; Sequence 30, Application US/10162012 

; Patent No. 6682597 

; GENERAL INFORMATION: 

; APPLICANT: Curtis, Rory A.J. 

; APPLICANT: Silos-Santiago, Inmaculada 

; APPLICANT: Gu, Wei 

; TITLE OF INVENTION: NOVEL HUMAN ION CHANNEL AND TRANSPORTER FAMILY MEMBERS 

; FILE REFERENCE: 10448-190001 

; CURRENT APPLICATION NUMBER: US/10/162,012 

; CURRENT FILING DATE: 2002-06-04 

; PRIOR APPLICATION NUMBER: US 60/209,845 

; PRIOR FILING DATE: 2000-06-06 

; PRIOR APPLICATION NUMBER: US 09/875,321 

; PRIOR FILING DATE: 2001-06-06 

; PRIOR APPLICATION NUMBER: PCT/US01/18340 

; PRIOR FILING DATE: 2001-06-06 

; PRIOR APPLICATION NUMBER: US 60/209,257 

; PRIOR FILING DATE: 2000-06-05 

; PRIOR APPLICATION NUMBER: US 09/875,423 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: PCT/US01/18398 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/209,238 

; PRIOR FILING DATE: 2000-06-05 

; PRIOR APPLICATION NUMBER: US 09/875,363 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: PCT/US01/18247

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/227,068 

; PRIOR FILING DATE: 2000-08-22 

; PRIOR APPLICATION NUMBER: US 09/928,530 

; PRIOR FILING DATE: 2001-08-13 

; PRIOR APPLICATION NUMBER: PCT/US01/25475 

; PRIOR FILING DATE: 2001-08-15 

; PRIOR APPLICATION NUMBER: US 60/226,770 

; PRIOR FILING DATE: 2000-08-21 

; PRIOR APPLICATION NUMBER: US 09/934,421 

; PRIOR FILING DATE: 2001-08-21 

; PRIOR APPLICATION NUMBER: PCT/US01/26096 

; PRIOR FILING DATE: 2001-08-21 

; PRIOR APPLICATION NUMBER: US 60/279,281 

; PRIOR FILING DATE: 2001-03-28 

; PRIOR APPLICATION NUMBER: US 10/109,029 

; PRIOR FILING DATE: 2002-03-28 

; PRIOR APPLICATION NUMBER: PCT/US02/09728

; PRIOR FILING DATE: 2002-03-28

; PRIOR APPLICATION NUMBER: US 60/290,288

; PRIOR FILING DATE: 2001-05-11 

; PRIOR APPLICATION NUMBER: US (not assigned) 

; PRIOR FILING DATE: 2002-05-13 

; NUMBER OF SEQ ID NOS: 48

; SOFTWARE: FastSEQ for Windows Version 4.0

; SEQ ID NO 30

; LENGTH: 672

; TYPE: PRT

; ORGANISM: Homo sapiens 



US-10-162-012-30 



Query Match 9.8%; Score 292; DB 4; Length 672; 

Best Local Similarity 24.1%; Pred. No. 1.4e-19; 

Matches 147; Conservative 91; Mismatches 237; Indels 136; Gaps 22; 

Qy 8 LIAI IVFYLLI LLVGIWAAWRTKNS GSAEERSEAI IVGGRDI GLLVGGFTMTATWVGGGY 67 

: : I : : | | : : I | : I : Mil: : I I : I : : I : : I I : 

Db 26 ILVIAAYFLLVI GVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGAS LFASN I GS GH 80

Qy 68 I N GT AEAVYVP G YGLAWAQ AP I G Y S L S LILGGLFFAKPMRSKGYVTMLDPFQQI YG 123 

II III II:: : : I I I I : I : I I I 
Db 81 FVGLA GT GAAS GLAVAGFEWNAL FWL LLGWL FAPVYLTAGVI TM PQYLR 130 

Qy 124 KRMGG LLFI PALMGEMFWAAAI F — SAL GAT I S VI I DVDMH I S VI I S A 169 

I I I I I : I : : : | : | | | | | : I I I 

Db 131 KRFGGRRIRLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNI Y AS VI ALL 182 

Qy 170 L I AT L YT LVGGL Y S VAYT DWQL FC I FVGLW I S VP FAL SH PAVAD I G FT AVHAK Y 224 

I : I I : I I I :: I I I II I I I I : : I I : : : II 

Db 183 GI TMI YTVTGGLAALMYT DTVQT FVI LGGAC I LMGYAFHEVG GYSGLFDKYLGAAT 238 

Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAYF 258 

: I : I : I I I : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSFCYRPRPDSYHLLRHPVTGDLPWPALLLGLTIVS GWYWCSDQVIV 298 

Qy 259 QRVL S S S SAT YAQVL S FLAAFGCLVMAI P AI L I GAI GAS T DWNQ T AYGL P D P KT TE 314 

I I I : I I : : I : I : : I I : : | : i : II 

Db 299 QRCIAGKSLTHIKAGCILCGYLKLTPMFLMVMPGMISRILYPDEVACWPEVCRRVCGTE 358 

Qy 315 E — ADMI LP I VLQ YLC PVYI S FFGLGAVS AAVMS S ADS S I LS AS SMFARN I YQLS FRQNA 372 

: : : I : : I I : I : I I : I I I I I : I : : I : I I II 
Db 359 VGCSNIAYPRLVVKLMPNGLRGLMLAVMLAALMS SLAS I FNS S STLFTMDI Y-TRLRPRA 417 

Qy 373 SDKEIWVKRI-TVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLCV LFV 427 

I : I : : I I : I I : I : : : I : I : I : I I I I 

Db 418 GDRELLLVGRLWWFIVWSVAWLPWQAAQGGQLFDYIQAVSSYI^PPVSAVFVLALFV 477 

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV — 485 

I I I I : I I : : I : | | : : I 

Db 478 PRVNEQGAFWGLIGGLLMGLARLIP EFSFGSGSCVQP 514 

Qy 486 TSFLTNICISYLAKYLFE-SGTLPPKLDVFDAW ARH S E ENMDKT I 530 

: I I : I I I I I I I : : I : I I I : I 
Db 515 SACPAFLCGVHYLYFAIVLFFCSGLLTLTVSLCTAPIPRKHLHRLVFSLRHSKE 568 

Qy 531 LVKNENIKLDE 541 

: I : : I I 
Db 569 — EREDLDADE 577 



RESULT 9 

US-09-543-681A-4994 

; Sequence 4994, Application US/09543681A 

; Patent No. 6605709 

; GENERAL INFORMATION: 

; APPLICANT: GARY BRETON 



; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PROTEUS MIRABILIS FOR

TITLE OF INVENTION: DIAGNOSTICS AND THERAPEUTICS
FILE REFERENCE: 2709.1002-001
CURRENT APPLICATION NUMBER: US/09/543,681A
CURRENT FILING DATE: 2000-04-05
PRIOR APPLICATION NUMBER: US 60/128,706
PRIOR FILING DATE: 1999-04-09
NUMBER OF SEQ ID NOS: 8344
SEQ ID NO 4994
LENGTH: 548
TYPE: PRT 

ORGANISM: Proteus mirabilis 
US-09-543-681A-4994 

Query Match 9.3%; Score 277.5; DB 4; Length 548; 

Best Local Similarity 24.3%; Pred. No. 2.6e-18; 

Matches 144; Conservative 95; Mismatches 229; Indels 125; Gaps 23; 
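Each hit in this listing carries the same three summary lines, so the fields can be pulled out mechanically. The following is a minimal sketch, assuming the lines have already been cleaned of OCR spacing errors; `parse_summary` is a hypothetical helper, not part of the search software, and it simply splits on semicolons and treats the last token of each part as the numeric value.

```python
def parse_summary(lines):
    """Collect 'name value;' pairs from the summary lines into a dict."""
    fields = {}
    for line in lines:
        for part in line.split(";"):
            part = part.strip()
            if not part:
                continue
            # The value is the last whitespace-separated token, e.g.
            # "Pred. No. 2.6e-18" -> ("Pred. No.", "2.6e-18").
            name, _, value = part.rpartition(" ")
            fields[name.strip()] = float(value.rstrip("%"))
    return fields


# Summary lines for RESULT 9 (US-09-543-681A-4994), copied from the report.
summary = parse_summary([
    "Query Match 9.3%; Score 277.5; DB 4; Length 548;",
    "Best Local Similarity 24.3%; Pred. No. 2.6e-18;",
    "Matches 144; Conservative 95; Mismatches 229; Indels 125; Gaps 23;",
])
print(summary["Score"], summary["Pred. No."])
```

Percent signs are dropped, so "Query Match" and "Best Local Similarity" come back as plain percentages.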

Qy 4 HVEGL IAIIVFYLLILL-VGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGF 56 

III: II | : : | : : : I : I : : : I : : : : : I : : I 

Db 6 HTEGVGLSTIDYAIFALYVIIIISLGLWV SRS KDGAKKGTKDYFLAGKTLPWWAI GS 62 

Qy 57 TMTATWV GGGYINGTAEAVYVPGYGLAWAQAP I GYS LS LI LGGLFFAKPMR 107 

: : I : I I : I I I I II I : I I : : I 

Db 63 SLIAANISAEQFIGMSGSGFSIGLAIASY EWMAA LTLIIVAKYFLPIFI 111 

Qy 108 S KGYVTMLDP FQQI YGKRMGGLLFI PALMGEMFWAAA- 1 FS AL GATISVIIDV 159 

I I I : : : : I III : I I I I I I : I : I 

Db 112 EKGI YTI PEFVENRFKSR — NLKTILA VFWLALFI FVNLTSVLYLGSLALETILGV 165 

Qy 160 DMHI SVI I SALI ATLYTLVGGLYSVAYTDWQLFCI FVGLWI SVP FALSH 209

I : : I I I I : I : I I I I : I I : I I I I I : I : : I : I : I : 
Db 166 PMMYAIIGLALFAVIYSLYGGLSAVAWTDWQVFFLILGGLFTTVLAVSYIGGDGGIMEG 225 

Qy 210 PAVADIGFTAVHAKYQKPWLGTVDSSEVYSWLDSFLLLMLGGIPW Q 255 

I I I : I I : : : : : : I I : I I 

Db 226 LSKMTAAAPDHFKMILAKENPQFMNLPG IAVLIGGL-WVANLYYWGFNQ 273 

Qy 256 AYFQRVLS S S S AT YAQVL S FLAAFGCLVMAI PAI L - - 1 GAI GAST DWNQTAYGL P 308 

I I I : : I II III I : : I : : I I : I I III 

Db 274 YIIQRALAAKSINEAQKGLVFAAFLKLIVPILVWPGIAAFVITTDPTLMA-GLGTMAQE 332 

Qy 309 DPKTTEEADMI LPI VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFARNI YQLS F 368

I : I I I : I : I I : I : : I I : : I I | : I : : : I : I I : 

Db 333 HI PT LAQADKAYPWLTQFL- P I GAKGWFAALAAAI VS S LASMLNS I AT I FTMDI YKE YI 391 

Qy 369 RQNAJSDKEIVWMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYI VIFPQLLC 423 

: I : : I I I I : : I =1 I I : I I : : I : I 

Db 392 GPKSSETRLVNVGRISAVIALIIACFIAPL LGGIDQAFQYIQEYTGLVSPGILA 445 

Qy 424 V LFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPF 479 

I | | I I I I I : I I : : : I I I I I I 

Db 446 VFLLGLFWKKTNAKGAIIGWLSIPFAL FLKLMPL GMPF 484 

Qy 480 KT LAMVT S FLTN I C I S YLAK YL FE S GT L P P KLDVFDAWARH S EENMDKT I LV 532 

I I I : I : : : I : : I I I I : I : : 



Db 485 LDQMMYTFIFTAWIGLVSLTSTKSDDSVGAIVLTDATFKTQSGFNIASYIIM 537



RESULT 10 

US-09-543-681A-6886

; Sequence 6886, Application US/09543681A 

; Patent No. 6605709 

; GENERAL INFORMATION: 

; APPLICANT: GARY BRETON 

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PROTEUS MIRABILIS FOR

; TITLE OF INVENTION: DIAGNOSTICS AND THERAPEUTICS

; FILE REFERENCE: 2709.1002-001

; CURRENT APPLICATION NUMBER: US/09/543,681A

; CURRENT FILING DATE: 2000-04-05

; PRIOR APPLICATION NUMBER: US 60/128,706

; PRIOR FILING DATE: 1999-04-09

; NUMBER OF SEQ ID NOS: 8344

; SEQ ID NO 6886 

; LENGTH: 554

; TYPE: PRT
; ORGANISM: Proteus mirabilis 
US-09-543-681A-6886 

Query Match 9.2%; Score 274.5; DB 4; Length 554; 

Best Local Similarity 23.0%; Pred. No. 5.1e-18; 

Matches 125; Conservative 96; Mismatches 214; Indels 109; Gaps 24; 

Qy 6 EGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGL LVGGFTMTA 60 

: : I : : I I I : I I : I I : : I I : I I : I : I I I 

Db 38 QAIIMFLIFVGLTLYITYWASKRTRS RS DYYTAGGKI T GFQNGMAI AGD FMS AA 91 

Qy 61 TWVGGGYINGTAEAVYVPGY-GLAWAQAPIGYSLSLILGG L FFAKPMRS KG YVTML 115 

: : : I : II II II II : : : I | : : I : I I 

Db 92 SFL GISALVYTSGYDGLI YSIGFLIGWPIILFIIAERLRNLGRYTFA 138 

Qy 116 DPFQ-QI YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI ATL 174 



Db 139 DWSYRLSPKPIRTLSAIGSLVWALYLIAQMVGAGKLIELLFGLNYHIAVILVGILMVL 198

Qy 175 YT LVGGL Y S VAYT D WQL FC I FVGLWI S VP FAL S H P AVAD I G FTAVHAK YQKPWL- 229

I | i | ; . . • . . ♦ i • ii i • • • i i

Db 199 YVLFGGMLATTWVQI I KAI LLLAGATFMAVMVMK AADFNFNTLFKEAVNVHQKGFSI 255

Qy 230 GTVDSSEVYSWLDSFLLLMLG — G I PWQAY FQ RVL S S S S AT Y 269

II I'll I I I I I ■ I i

Db 256 MSPGGLV— SDPISALSLGLALMFGTAGLP— HIIMRFFTVSDAKEARKSVFYATGFIGY 311

Qy 270 AQVL S FLAAFGC LVMAI P AI L I GAI GASTDWNQTAYGLPDPKTTEEADMILPIVLQ 325

Db

Qy 326 YLC PVYI S FFGLGAVS AAVMSSADSSILSASSMFARNI YQLSFRQ-NASDKEIVWV 380

I -1111*1 i - • • i • • i • • i • i .... i

Db 354 AVGGN F F- LG FI S AVAFAT I LAWAGLTLAGAS AVS H DL YANVI KN GQADERQE LKV 409

Qy 381 MRITVFVFGASATAMALL— TKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAG 438



•Ill ■ I I • • I * " • I • ■ II * I I I I I I I I 

Db 410 SKITWILGIVAIGLGILFEKQNIAFMVGLAFSIAASCNFPIILLSMYWKGLTTRGAVIG 469 

Qy 439 YVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLA 498 

I I I : : I : I I I : I II : : I | : : | : : : 

Db 470 GWSGLIVAVT LMILGPTI-WVSILGHDTPIYPYEYP ALFSMIIAFIV 515 

Qy 499 KYLF 502 

: I I 

Db 516 SWLF 519 



RESULT 11 
US-09-657-960-3 

; Sequence 3, Application US/09657960 

; Patent No. 6649342 

; GENERAL INFORMATION: 

; APPLICANT: Mack, David 

; APPLICANT: Gish, Kurt 

; TITLE OF INVENTION: NOVEL METHODS OF DIAGNOSING BREAST CANCER, COMPOSITIONS, AND METHODS OF

; TITLE OF INVENTION: SCREENING FOR BREAST CANCER MODULATORS

; FILE REFERENCE: A-69196/DJB/JJD

; CURRENT APPLICATION NUMBER: US/09/657,960

; CURRENT FILING DATE: 2000-09-08 

; PRIOR APPLICATION NUMBER: US 09/525,361 

; PRIOR FILING DATE: 2000-03-15 

; PRIOR APPLICATION NUMBER: US 09/453,137 

; PRIOR FILING DATE: 1999-12-02 

; PRIOR APPLICATION NUMBER: US 09/450,810 

; PRIOR FILING DATE: 1999-11-29 

; PRIOR APPLICATION NUMBER: US 09/268,865

; PRIOR FILING DATE: 1999-03-15

; PRIOR APPLICATION NUMBER: PCT/US00/06952

; PRIOR FILING DATE: 2000-03-15

; NUMBER OF SEQ ID NOS: 4

; SOFTWARE: PatentIn version 3.0

; SEQ ID NO 3 

; LENGTH: 718 

; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-09-657-960-3 



Query Match 9.2%; Score 272.5; DB 4; Length 718; 

Best Local Similarity 22.3%; Pred. No. 1.2e-17; 

Matches 152; Conservative 113; Mismatches 241; Indels 177; Gaps 30; 

Qy 9 I AI I - VFYLLI LLVGI WAAWRTKN S GSAEERS EAI I VGGRDI GLLVGGFTMTATWVGGG- 66 

I I I : : : : : I : : : I : I I : : I : I : I : I I : I 

Db 10 I AI VALYFI LVMCI GFFAMWKSNRSTVS GYFLAGRSMTWVT I GAS L 55 



Qy 67 YIN GT AEAVYVP G YGLAWAQ AP I G Y S LSLILGGLFFAKPMRSKGYVTMLD 116 

: : : : : I I I : I I : : I : I I : I : t I I I I 

Db 56 FVSNIGSEHFI GLAGS GAAS GFAVGAWEFNALLLLQLLGWVFI P I YI RS - GVYTM- - 109 

Qy 117 PFQQIYGKRMGG LLFI PALMGEMFWAAAI FSALGAT I SVI I DVDMHI S 164 

: I I I I : I : I : : : I : I | : : : : : I 



Db 110 — PEYLSKRFGGHRIQVYFAALSLILYIFTKLSVDLYSGALF IQESLGWNLYVS 161



Qy 165 VII SAL I AT L YT LVGG L Y S VAYT DWQ L FC I FVGLW I S VP FAL S H PAVAD I - G FT AVHAK 223 

II: : I I : I I I : I I M : I : : I I : : : I I I I : 

Db 162 VI LLI GMTALLTVTGGLVAVI YTDTLQALLMI I G ALTLMI I S IMEI GGFEEVKRR 216 

Qy 224 YQKPWLGTVDSSEVYSWLDSF LLLMLGG IPW 254

I : I : I I I : : III : I I 

Db 217 YM LAS PDVTSILLT YNLSNTNSCNVS PKKEALKMLRNPTDEDVPWPGFI LGQTP 270 

Qy 255 Q AY FQ RVL S S S SAT YAQ VLSFLAAFGCLVMAI PAIL IGA 293

I | | | | : : : : | : : I I | 

Db 271 ASWYWCADQVIVQRVLAAKNIAHAKGSTLMAGFLKLLPMFIIVVPGMISRILFTDDIAC 330 

Qy 294 I GASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAV 344 

I I : : I I | : : | | I : : : I I : 

Db 331 INPEHCMLVCGSRAGCSNIAY PRLVMKLVPVGLRGLMMAVMIAAL 375 

Qy 345 MS SAD S S I L S AS SMFARN I YQL S FRQNAS DKE I VWVMRI T V- FVFGAS AT AMAL LT KT VY 403

II II I I I : : I : : I : I I : : I I : I : : I I I I I : I : : : : 

Db 376 MSDLDS I FNSASTI FTLDVYKL- 1 RKSAS SRELMI VGRI FVAFMWI S I AWVP 1 1 VEMQG 434 

Qy 404 GLWYLS S DLVYI VI FPQL LCVLFVKGTNT YGAVAGYVSGLFLRITGGEPYLY 455 

III I : I : I : I I I I I : II : I I I : I 

Db 435 GQMYLYIQEVADYLTPPVAALFLLAIFWKRCNEQGAFYGGMAGFVLGAVRLILA FAY 491

Qy 456 LQPLIFY PGYYPDDNGIYNQKFPFKTLAMVT SFLT NICISYLAKY 500 

I | | : | : : | | : : I I I I : I 

Db 492 RAPECDQPDNRPGFIKDIHYMYVATGLFWVTGLITVIVSLLTPPPTKEQIRTTTFWSKKN 551

Qy 501 LFESGTLPPKLDVF D AWARH S E ENMD KT I L VKN ENIK LDELALVKPRQ 549 

I ||:: : Mill : : I : : I I I III: 

Db 552 LWKENCSPKEEPYQMQEKSILRCSENNETINHIIPNGKSEDSIKGLQPEDVNLLVTCRE 611 

Qy 550 SMTLSSTFTNKEAFLDVDSSPEG 572 

: : : I I I I : I 
Db 612 EGNPVASLGHSEAETPVDAYSNG 634 



RESULT 12 

US-09-134-001C-4744 

; Sequence 4744, Application US/09134001C 
; Patent No. 6380370 
; GENERAL INFORMATION: 

; APPLICANT: Lynn Doucette-Stamm et al
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO STAPHYLOCOCCUS

; TITLE OF INVENTION: EPIDERMIDIS FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: GTC-007

; CURRENT APPLICATION NUMBER: US/09/134,001C

; CURRENT FILING DATE: 1998-08-13 

; PRIOR APPLICATION NUMBER: US 60/064,964 

; PRIOR FILING DATE: 1997-11-08 

; PRIOR APPLICATION NUMBER: US 60/055,779 

; PRIOR FILING DATE: 1997-08-14 

; NUMBER OF SEQ ID NOS: 5674

; SEQ ID NO 4744

; LENGTH: 518
; TYPE: PRT

; ORGANISM: Staphylococcus epidermidis 
US-09-134-001C-4744 

Query Match 8.8%; Score 262.5; DB 4; Length 518; 

Best Local Similarity 22.2%; Pred. No. 6.9e-17; 

Matches 126; Conservative 102; Mismatches 223; Indels 117; Gaps 25; 

Qy 9 IAI I VFYLLI LLVGIWAAWRTKNSGSAEERS EAI I VGGRDI GLLVGGFTMTATWVGGGYI 68 

: I I I : : : : : I : : I : : I : I : : I I I II : : I : : t I 

Db 27 VMIIVYFIILLIIGFY-GYRQATGNLSE FMLGGRSIGPYITALSAGAS DMSGWMI 80 

Qy 69 NGTAEAVYVPGYGLAWAQAP I GYS LS LI LGGL - - FFAKPMRS KGY VTMLDPFQ 119 

I : t I I t : : I i I : I I : I : I : ! I : 

Db 81 MGLPGSVYSTGLSAIW ITIGLTLGAYINYFWAPRLRVYTEIAGDAITLPDFFK 134 

Qy 120 QIYGKRMGGLLFIPALMGEMFWAAAIFSAL GAT I S VI I DVDMH I S VI I S AL I AT L YT 176 

: : I I : : I : I t : : I : : I I : I I I 

Db 135 NRLDDKKNIIKIISGLIIWFFTLYTHSGFVSGGKLFESAFGLNYHAGLLIVAIIVIFYT 194 

Qy 177 LVGGL Y S VAYT DWQL FC I FVGLWI S VP FAL S H P AVADI - G FT AVHAK YQ- K PW 228 

I I : I : I I I : : : : M I : : I : I III 

Db 195 FFGGYLAVS ITDFFQGVIMLIAM-VMVP IV ALLKLNGWDTFHDIAQMKPTNLDLFR 249 

Qy 229 LGTVDS SEVYSWLDS FLLLMLGGI PWQAYFQRVLS S S SATYAQVLS FLAAFGCLVM 284 

III : : II I : : : : I : : I II 

Db 250 GTTVLGIV SLFSW GLGYFGQPHIIVRFMSIKSHKLLPKARRLGISWM 296

Qy 285 AIPAILIGAIGASTDW NQTAYGLPDPKTTEEADMILPIVLQYLCPVYI SFFGLGAV 340 

I : hllll : : I I I : I : : : | | : I I I : 

Db 297 AVG--LLGAIGVGLTGISFISERHIKLEDPET LFI VMSQI LFHPLVGGFLLAAI 348 

Qy 341 SAAVMSSADSSILSASSMFARNIYQL SFRQNASDKEIVWVMRITVFVFGASATAMAL 397 

I I : I I : I : I I I : I : I I : : : I I I : I : : I : 1=1 
Db 349 LAAIMSTISSQLLVTSSSLTEDFYKLIRGSDKASSHQKEFVLIGRLSVLLVAIVAITIA- 407

Qy 398 LTKTVYGLWYLSSDLVYIV IFPQLLCVLFVKGTNTYGAVAGYVSGLFLRI 447 

I : : : : : I I : I I : I I I : : I I : I : I 

Db 408 WH PN DT I LN LVGNAWAG FGAAF S P LVL Y S L YW KDLT RAGAI S GMVAGAVWI 459 

Qy 448 TGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTL 507 

: : : I I : : I : I : : I : : : I : I : I I 
Db 460 VW ISWIKPLATINAFF GMYE IIPGFIVSVLITYIVSKL TK 499 

Qy 508 P P K L DVFDAWARH S E ENMD KT I LVKN E 535 

I II: I t : : I II 

Db 500 KPD DYVI ENLNKVKHWKE 518 



RESULT 13 

US-09-328-352-6371 

; Sequence 6371, Application US/09328352 

; Patent No. 6562958 

; GENERAL INFORMATION: 

; APPLICANT: Gary L. Breton et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO ACINETOBACTER

; TITLE OF INVENTION: BAUMANNII FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: GTC99-03PA

; CURRENT APPLICATION NUMBER: US/09/328,352
; CURRENT FILING DATE: 1999-06-04
; NUMBER OF SEQ ID NOS: 8252
; SEQ ID NO 6371

; LENGTH: 501

; TYPE: PRT

; ORGANISM: Acinetobacter baumannii 
US-09-328-352-6371 



Query Match 8.7%; Score 259; DB 4; Length 501; 

Best Local Similarity 23.1%; Pred. No. 1.4e-16; 

Matches 126; Conservative 97; Mismatches 206; Indels 116; Gaps 22; 



Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

I:: || :|: : ::|:|::| 1 : hill :| 1 : 1
Db MSYFDPTLIMFMVYIVAMVLIGLFAYRATSDFSD YILGGRSLGSFVTALSAGA 58

Qy 61 TWVGGGYI NGTAEAVYVPGYGLAWAQAP I GYS LS LI LGGLFFAKPMR S KGYVTML 115

: : I : 1 I : I : 1 II 1 1 : 1 1 1 : 1 • 1 :
Db 59 SDMSGWLLMGLPGAIYLSGLSEAW — I AI GL 1 1 GAWLNWLLVAGRLRVHT E I QNNALT L P 116

Qy 116 DP FQQI YGKRMGGLL FI P ALMGEMFWAAAI FS ALGAT I S V II DVDMH I S VI I SAL 170

||:: 1 1 : : : 1 : 1 1 : 1 1 : : : : : 1 1 :
Db 117 DYFTSRFDDQKKI LRI FSAVI I LVFF — AI YCASGMVAGARLFENLFGMSYTTAIWLSAI 174

Qy 171 I AT L YT LVGGL YS VAYT D WQL FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAK Y 224

1 : 1 1 : : : : 1 1 1 III 111 : : 1 : 1 :
Db 175 ATISYVCIGGFLAISWTDTFQ AGLMI FALLLTPIVTYLAIGDTTQFVTLIET 226

Qy 225 QKPWLGTVDSSEVYSWLDSFLLLMLGGIPW-QAYF — Q RVL S S S S AT YAQVL S FLAAFGC 281

: I : I | : : : | : | I I : | : 1 : 1 1
Db 227 ARPHAFNIIS DL S WAVL S S MAW GLGYFGQ PHIL VRFMAADS- 268

Qy 282 LVMAI PA ILI GAI GAS T DWNQT AYG L PDPK TTE EADMI L P I VLQY 326

1 : 1 1 1 1 : 1 1 : 1 1 : II 1 : : : : : : :
Db 269 -VKSIPAARRIGMTWMILCLVGAVGAG— FFGIAYFQQHPELAGWSKNPETVFMELTKI 325

Qy 327 LC P VY I S FFGLGAVS AAVMS SAD SSI LS AS SMFARN I YQ L S FRQNAS DKE I VWVMRI T VF 386

| : | Ihlllll: : 1 II : = 1 : 1 : 1 1 1 1 1 : 1 1 1 1 1 1
Db 326 LFNPWIVGIILAAILAAVTVISTLSCQLLVCSSALTEDLYKGFIRK^^ 385

Qy 387 VFGASATAMALLTKTVYGLWYLSSDLVYIVIF PQLLCVLFVKGTNTYGAV 436

| : | : | : | | : : : | : I : : 1 1 1 1 1 :
Db 386 AI AVLAI VLAG — N P D S KVLGLVAYAWAG FGAAFG P L 1 1 L S L FWK RMT L E GAL 436

Qy 437 AGYVSGLFLRITGGEPYLYLQPLIF — YPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICI 494

1 1 : 1 : 1 1 1 : : : II : : : | : : 1 :
Db 437 AGMI VGAVWI - - GWKNLFASTGVYEI I PGF 1 CAFI S 1 1 W 475

Qy 495 SYLAK 499

1 ::|
Db 476 SLISK 480





RESULT 14 

US-09-252-991A-27829 

; Sequence 27829, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS

; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 107196.136

; CURRENT APPLICATION NUMBER: US/09/252,991A
; CURRENT FILING DATE: 1999-02-18
; PRIOR APPLICATION NUMBER: US 60/074,788
; PRIOR FILING DATE: 1998-02-18

; PRIOR APPLICATION NUMBER: US 60/094,190
; PRIOR FILING DATE: 1998-07-27
; NUMBER OF SEQ ID NOS: 33142
; SEQ ID NO 27829

; LENGTH: 551

; TYPE: PRT

; ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-27829 

Query Match 8.7%; Score 257.5; DB 4; Length 551; 

Best Local Similarity 22.3%; Pred. No. 2.3e-16; 

Matches 102; Conservative 91; Mismatches 205; Indels 59; Gaps 16; 

Qy 6 EGLIAIIVFYLLILLVGI WAAWRTKN S GS AEERS EAI I VGGR- DI GLLVGGFTM 58 

: I I : : : II : : I : : I I I : : I : : : I I I I : I I I 

Db 28 KGARAMLLD YL IMLVYALAMLGLGWYGMR KAKSQSDFLVAGRRLGPGLYLG— TM 80

Qy 59 TATWVGGGYI NGTAEAVYVPGYGLAWAQAP I G YS LS LI LGGLFFAKPMRS KGYVTMLD P F 118 

I : I I II: I I I : | | : | : | | : : : I : 

Db 81 AAWLGGASTIGTVKLGYQFGLSGLWLVFMLG — LGI IVLSLVFSRQIARLRVFTVTQVL 138 

Qy 119 QQIY GKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALIATLY 175 

: I I : : | | : : : : : | I : I : I : : : : : I : I 

Db 139 EQRYQAS SRLI GGWMVAY DLMVAVTATIAIGSVTEWFGIPRIPAILCGGGIVIVY 195 

Qy 176 TLVGGLYSVAYTDWQLFCIFVGLW-ISVPFALSHPA VAD I G FT AVHAK YQ K P 227 

: : : I I : : I : I I : : I I I : : : : I : : I II 
Db 196 SVIGGMWSLTLTDIIQFVI KTVGIFLVLLPLSIDGAGGLARMQEVLPAGFFD 247 

Qy 228 WLGTVDS S EVYSWLDS FLLLMLGGI PWQAYFQRVLSS S SAT YAQVLS FLAAFGCLVMAI P 287 

II: : : : III I : I : I I I : : I I : I I = = 

Db 248 -LGHIGLDTILTY FLLYFFGALI GQDIWQRVFTARS ETWRYAGLGAGVYCMLYGAA 303 

Qy 288 AILIGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGL — GAVSAAVM 345 

I I I I III II I : I I II I : I : I 

Db 304 C AL I GAAARVL LPDLAVPENA — YAEITREVLAP GLRGLWAAALSAIM 350 

Qy 346 S SADS S I LSAS SMFARNI YQLS FRQNAS DKEI VWVMRI TVFVFGASATAMALLTKTVYGL 405 

I : I : I : I : : : : I I I : : : I : I : : I I I 

Db 351 STASGCLLAAATVLQEDIYARFLRPGTTSD--IRLSRCITLLMGVAMLVLACLVNDVIAA 408 



Qy 406 WYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSG 442

• - • I • • • • I • • I I • I I • I

Db 409 LSIAYNLLVGGLLVPIVGALLWRRASPQGAIASIVAG 445



RESULT 15 

US-09-489-039A-7541

; Sequence 7541, Application US/09489039A 

; Patent No. 6610836 

; GENERAL INFORMATION: 

; APPLICANT: Gary Breton et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO KLEBSIELLA

; TITLE OF INVENTION: PNEUMONIAE FOR DIAGNOSTICS AND THERAPEUTICS

; FILE REFERENCE: 2709.2004001

; CURRENT APPLICATION NUMBER: US/09/489,039A

; CURRENT FILING DATE: 2000-01-27 

; PRIOR APPLICATION NUMBER: US 60/117,747 

; PRIOR FILING DATE: 1999-01-29 

; NUMBER OF SEQ ID NOS: 14342

; SEQ ID NO 7541

; LENGTH: 508

; TYPE: PRT
; ORGANISM: Klebsiella pneumoniae 
US-09-489-039A-7541 

Query Match 8.6%; Score 255; DB 4; Length 508; 

Best Local Similarity 24.8%; Pred. No. 3.6e-16; 

Matches 126; Conservative 85; Mismatches 219; Indels 78; Gaps 21 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWR-TKNSGSAEERSEAIIVGGRDIGLLVGGFTMT 59 

II I : II: : : | : | : | | | I | | : I : I I I : I I : 

Db 7 MAISTPMLVTFIVYIFGMVLIG-FIAWRSTKN FDDYILGGRSLGPFVTALSAG 58 

Qy 60 ATWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMR SKGYVTM 114 

I : : I : I I : : : I : I I I : I : I : I : : I : 

Db 59 AS DMSGWLLMGLPGAIFLSGISESW — IAIGLTLGAWINWKLVAGRLRVHTEVNNNALTL 116 

Qy 115 LDP FQQI YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SV IIDVDMHISVII SA 169 

I I : : | | | | : : I : I : I I : : : I 

Db 117 PDYFTGRFEDKSRVLRIISALVILLFF — T I YCAS GI VAGARLFE S T FGMS YETALWAGA 174 

Qy 170 LIATLYTLVGGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADI GF-TAVHAKYQ 225 

: I I I I I : I : : I I I I : I : : I : I II : : I 

Db 175 AAT 1 1 YT FVGG FLAVS WT DT VQAS LMI FAL I LT PVIVIISVGGFGDSLEVIKQ 227 

Qy 226 KPWLGTVDS S EVYSWLDS FLLLMLGGI PW QAYFQRVLSSSSATYAQVLSF 275 

I :::::: I : : : I I I I I I I : I : : I 

Db 228 K S I ENI DMLKGLNFVAI I S LMG — WGLGYFGQPHI LARFMAADSHHS IVHARRI SM 281 

Qy 276 LAAFGCLVMAI PAILIGAIGASTDWNQTAYGLPDPK TTEEADMILPIVLQYLCPVY 331 

II I : : I I II : I : I : : : I I : 

Db 282 TWMILCLG GAVAVGFFG 1 AYFNNN P S LAGAVNQNAERVFI ELAQI LFN PW 331 

Qy 332 I S FFGLGAVS AAVMS SAD S S I LSAS SMFARN I YQLS FRQNAS DKE I VWVMRI TVFVFGAS 391 

I : I I : Mill: : I II : : I : I : I I I I : I I I [ : I I 

Db 332 I AG I LLSAI LAAVMSTLS CQLLVCS SAI TEDLYKAFLRKNAGQKELVIVVGRMMVTjVVALV 391 



Qy 392 ATAMA LLTKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSGLFL 445 

11:1 :| I : :|:l | : : : | II : I I I I 

Db 392 AI ALAAN P EN RVLG LVS YAWAG FGAAFG P WL F S VMW S RMT RN - GALAGMVI GALT 446 

Qy 446 RITGGE-PYLYLQPLI FYPGYYPDDNGI 472

I : : I I : I II : I I 

Db 447 VI VWKQFGWLGLYEI I — PGFVFGSIGI 472 



Search completed: September 28, 2004, 17:09:58 
Job time : 37 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
Run on: September 28, 2004, 16:59:21 ; Search time 43 Seconds
(without alignments)
1297.467 Million cell updates/sec

Title: US-10-069-541-6
Perfect score: 2972
Sequence: 1 MAFHVEGLIAIIVFYLLILL...EAFLDVDSSPEGSGTEDNLQ 580

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5


Searched: 283366 seqs, 96191526 residues 
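The throughput figure in this run header can be cross-checked against the other numbers it reports. This is only a sketch under one assumption that the report itself does not state: that a "cell update" is one query-residue by database-residue cell of the dynamic-programming matrix, so the total is query length times database residues. The constant and function names are illustrative.

```python
# Assumption: one "cell update" is one query-residue x database-residue
# dynamic-programming cell, so total cells = query length * database residues.
QUERY_LENGTH = 580            # residues in the query, US-10-069-541-6
DATABASE_RESIDUES = 96191526  # "Searched: 283366 seqs, 96191526 residues"
SEARCH_SECONDS = 43           # "Search time 43 Seconds"


def million_cell_updates_per_second(query_len, db_residues, seconds):
    """Throughput in millions of matrix cells filled per second."""
    return query_len * db_residues / seconds / 1e6


rate = million_cell_updates_per_second(
    QUERY_LENGTH, DATABASE_RESIDUES, SEARCH_SECONDS
)
print(f"{rate:.3f} Million cell updates/sec")
```

With these inputs the result agrees with the 1297.467 Million cell updates/sec printed in the header, which supports the assumed definition.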

Total number of hits satisfying chosen parameters: 283366

Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database: PIR_78:*
pir1:*
pir2:*
pir3:*
pir4:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
RESULT 1
JC7502 

choline transporter - human 
C; Species: Homo sapiens (man) 

C;Date: 17-Nov-2000 #sequence_revision 17-Nov-2000 #text_change 01-Dec-2000
C;Accession: JC7502 

R;Apparsundaram, S.; Ferguson, S.M.; George Jr., A.L.; Blakely, R.D. 
Biochem. Biophys. Res. Commun. 276, 862-867, 2000

A;Title: Molecular cloning of a human, hemicholinium-3-sensitive choline transporter.

A; Reference number: JC7502 

A; Contents: Spinal cord 

A; Accession: JC7502 

A;Molecule type: mRNA 

A;Residues: 1-580 <APP>

A;Cross-references: GB:AF276871

C; Comment: This protein, a hemicholinium-3-sensitive phosphorylated 
transmembrane protein, regulates high-affinity choline uptake, and plays the 
roles in disease states. 
C;Genetics:

A;Gene: cht

A;Map position: 2q12

C; Keywords: choline transport; spinal cord; transmembrane protein; transport 
protein 



Query Match 100.0%; Score 2972; DB 2; Length 580;
Best Local Similarity 100.0%; Pred. No. 9.5e-211;
Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||||||||||||||||||||||||||||||||||||||||
Db 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580





RESULT 2 
T20037 

hypothetical protein C48D1.3 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999 
C;Accession: T20037 
R; Burton, J. 



submitted to the EMBL Data Library, October 1996 
A;Reference number: Z19214 
A;Accession: T20037

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-631 <WIL>

A;Cross-references: EMBL:Z81049; PIDN:CAB02847.1; GSPDB:GN00022; CESP:C48D1.3

A;Experimental source: clone C48D1

C;Genetics:

A;Gene: CESP:C48D1.3

A; Map position: 4 

A;Introns: 82/1; 120/3; 187/1; 236/3; 249/1; 358/1; 510/3; 570/2 

Query Match 45.8%; Score 1361.5; DB 2; Length 631; 

Best Local Similarity 46.8%; Pred. No. 2.5e-92; 

Matches 290; Conservative 91; Mismatches 146; Indels 93; Gaps 10 

Qy 7 GLIAI IVFYLLI LLVGIWAAWRTKNSGSAEER SEAI IVGGRDIGLLVGGFTMTATW 62 

I : : I I : I I : I I I : I I I I I : : I : I I : I : : : I I : I I I I I I I I I I I I 

Db 6 G I VAI VF F YVL I LWG I WAGRK SKSSKELES EAGAAT E EVMLAG RN I GT L VG I FTMT ATW 65 

Qy 63 VGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDP 117

III I I I I I I I I : I I 1 I I I : I I : : I I : : I I I III II : I I : I I I I I 

Db 66 VGGAYINGTAEALY— NGGLLGCQAPVGYAISLVMGGLLFAKKMREEGYITMLDPFQFWN 123 

Qy 118 FQQI YGKRMGGLLFI PALMGEMFWAAAI F 146 

II I I : I : I II :: : I I I : I I I I IN 
Db 124 FLELIFGRTFDNFRKLGRFLKLQTIIEILDFFQHKYGQRIGGLMWPALLGETFWTAAIL 183 

Qy 147 SALGATI SVI I DVDMHI SVI I SALIATLYTLVGGLYSVAYTDWQLFCI FVGLWI SVPFA 206 

I I I I I I : I I I :: I I : I I : I I I I II I I I : I I I M I I I I I I I I I I I I : 
Db 184 SAL GAT L SVI LG I DMNAS VT L SAC I AVF YT FT GGY YAVAYT D WQL FC I FVGLL I L GL YV 243 

Qy 207 LSHPAVADI GFTAVHAKYQKPWLGTVDS S EVYSWLDS FLLLMLGGI PWQAYFQRVLS S S S 266 

: I I I : I I I I : I II t I I I I I I I I I I : 

Db 244 QNRPN RFKET SLWI DCMLLLVFGGI PWQVYFQRVLS S KT 282 

Qy 267 AT YAQVL S FLAAFGCLVMAI PAI L I GAI GAS T DWNQTAYGL P D PKTT EEA DMIL 320 

I I I I I I : I I I : : I I I I Mill : I I I II : I I : : I : : 

Db 283 AHGAQTLSFVAGVGCILMAIPPALIGAIARNTDWRMTDYSPWNNGTKVESIPPDKRNMW 342 

Qy 321 PI VLQYLCPVYI S FFGLGAVSAAVMS SAD S S I LSAS SMFARNI YQLS FRQNAS DKEI VWV 380 

I : I III I : : : I I I I I I I I I I M I I I I I : I I I : I I I I I I : : I : I : I I : I I : : I 
Db 343 PLVFQYLTPRWVAFIGLGAVSAAVMSSADSSVLSAASMFAHNIWKLTIRPHASEKEVIIV 402 

Qy 381 MRITVFVFGASATAMALLTKTVYGLWYLSSDLVTIVIFPQLLCVLFVKGTNTYGAVAGYV 440 

III: I II II I : : : | | I I M : I I I I : : : I I I I II I : : : : I I I I : : I I I 
Db 403 MRIAIICVGIMATIMALTIQSIYGLWYLCADLVY^ILFPQLLCVVYMPRSNTYGSLAGYA 462 

Qy 441 SGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKY 500

I I I I : I I I I : I III : I : I I I I : I I I : : I I : I : : 

Db 463 VGLVLRLIGGEPLVSLPAFFHYPMY TDGV--QYFPFRTTAMLSSMATIYIVSIQSEK 517

Qy 501 LFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLDELALVKPRQSMTLSSTFTNK 560

I I : I I I I : I I II I I : I : I I : I I I 

Db 518 LFKSGRLSPEWDVMGCW NIPIDHVPLPSD-VSFAVSSETLNM 559 



Qy 561 EAFLDVDSSPEGSGTEDNLQ 580 

: I I : I I I I 

Db 560 KVECDGMQFPQ-LQTEHRLQ 578 



RESULT 3 
D75188 

proline symporter (proline permease) . PAB2354 - Pyrococcus abyssi (strain Orsay) 
C;Species: Pyrococcus abyssi

C;Date: 20-Aug-1999 #sequence_revision 20-Aug-1999 #text_change 20-Jun-2000

C;Accession: D75188

R;anonymous, Genoscope

submitted to the EMBL Data Library, July 1999

A;Description: Pyrococcus abyssi genome sequence: insights into archaeal chromosome structure and evolution.

A;Reference number: A75001

A;Accession: D75188

A;Status: preliminary

A;Molecule type: DNA

A;Residues: 1-492 <KAW>

A;Cross-references: GB:AJ248283; GB:AL096836; NID:g5457433; PIDN:CAB48955.1; PID:g5457464

A;Experimental source: strain Orsay
C;Genetics:

A;Gene: putP-3; PAB2354

C;Superfamily: proline carrier protein

Query Match 11.6%; Score 344; DB 2; Length 492; 

Best Local Similarity 24.2%; Pred. No. 1.1e-17;

Matches 132; Conservative 99; Mismatches 196; Indels 118; Gaps 25; 

Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

I : I : : I : I I I : I III: I I I I I : : : : : 

Db 14 LVAFLFTLILPILVGFYAMKRTKS EEDFFVGGRAMDKITVALSAVSSGRSSWL 66 

Qy 68 I N GT AEAVYVP G YGLAWAQ AP I G Y S L S LILGGLFFAKPMRSKGYVTMLDPFQQI YG 123 

: I : II I | : | | : : : | : | : | : | | : : 

Db 67 VLGLSGMAYKMGWAVW--AAVGYIVAEMFQFVYMGIRLRKFSERFNAITVPDYFEARFR 124

Qy 124 K RMGG LLFI PALMGEMFWAAAIFSALGATI SVI IDVDMHISVIISALIATL 174 

I : : : | : : : | | | | | : | : : : : : : I I I : : 

Db 125 DT S K I LRI AAS 1 1 1 1 1 FLT S YVGAQ FNAGA KTLSTALGI SI FTALMI SVLMI IV 178 

Qy 175 YT L VGGL Y S VAYT D WQL FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAKYQK 226 

I : : I I : I I I I I : : : : I I : I III : I I I : I 

Db 179 YMI LGGFIAVAYNDVI RAVIMI I GLW-- LPVIAVAKVGGTEEVLKVLHALDPKLIN 233 

Qy 227 PW LGTVDS SEVYSWLDS FLLLMLG- GI PWQAY- FQRVLS S S SAT YAQVLS FLAAFGC 281 

II II I : I I I I : I : I : I : : I 

Db 234 PWAFGAGWIG FLGIGFGSPGQPHIIVRYMSIDDPNKLRVSTWGTFWN 282 

Qy 282 L VMAI P AI L I GAI GAS T DWNQ T AYGL P D P KT T — EEADMILP-IVLQYLCPVYISFFGLG 338 

: I : I I I : I I : : I I : I : I I I : I I I : : I 

Db 283 WLAWGAI FVGLAGRAI VPDVSQLPGKNAEMIYPYLSAQYFPPILYGIL-IG 333 

Qy 339 AVS AAVMS SAD S S I L S AS SMFARN I YQL S FRQNA — S DKE I VWVMRI T VFVFGAS AT AMA 396 

: I I : : I : I I I : I : I : : : I I : : : I : : I : I I I I I : I 



Db 334 GIFAAILSTADSQLLWASTWKDLYQEVIKKGTKIDEKTALTISRVTVLWGFLAAILA 393



Qy 397 LLTKTVYGLWYLSSDLVY-IVIFPQLLCVLFVKGTNTYGAVAGYVSGLFL 445

I : : I : : : I : I II: hill : I : | | : | 

Db 394 YVAKDIIFWFVLFAWGGLGASFGPTLILSLYWKGTTKWGVLAGMIVGTIT 443 

Qy 446 RITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESG 505

I I I I : I : I : I : I I : I : I : I : | 
Db 444 TIVW KLYLKPI TGLY-ELVP AF I FS L I AT 1 1 VSMI T K 479 

Qy 506 TLPPK 510 

I I : 

Db 480 --PPE 482 



RESULT 4 
A37226 

glucose transport protein - rabbit 

N;Alternate names: sodium/D-glucose cotransporter

C;Species: Oryctolagus cuniculus (domestic rabbit)

C;Date: 30-Dec-1991 #sequence_revision 01-Mar-1996 #text_change 20-Aug-1999
C;Accession: S00515; S15974; A37226

R;Hediger, M.A.; Coady, M.J.; Ikeda, T.S.; Wright, E.M.
Nature 330, 379-381, 1987

A;Title: Expression cloning and cDNA sequencing of the Na/glucose co-transporter.

A;Reference number: S00515; MUID:88065856; PMID:2446136
A;Accession: S00515
A;Molecule type: mRNA
A;Residues: 1-662 <HED>

A;Cross-references: EMBL:X06419; NID:g1640; PIDN:CAA29727.1; PID:g1641
R;Morrison, A.I.; Panayotova-Heiermann, M.; Feigl, G.; Schoelermann, B.; Kinne, R.K.H.

Biochim. Biophys. Acta 1089, 121-123, 1991

A;Title: Sequence comparison of the sodium-D-glucose cotransport systems in rabbit renal and intestinal epithelia.

A;Reference number: S15974; MUID:91223090; PMID:2025641
A;Accession: S15974
A;Molecule type: mRNA
A;Residues: 1-662 <MOR>

A;Cross-references: EMBL:X55355; NID:g1716; PIDN:CAA39040.1; PID:g1717
R;Coady, M.J.; Pajor, A.M.; Wright, E.M.
Am. J. Physiol. 259, C605-C610, 1990

A;Title: Sequence homologies among intestinal and renal Na(+)/glucose cotransporters.

A;Reference number: A37226; MUID:91023017; PMID:2221040

A;Accession: A37226

A;Status: preliminary

A;Molecule type: mRNA

A;Residues: 178-662 <COA>

A;Cross-references: GB:X06419

A;Experimental source: renal cortex

C;Superfamily: proline carrier protein

Query Match 10.4%; Score 308.5; DB 2; Length 662; 

Best Local Similarity 23.4%; Pred. No. 6.6e-15; 

Matches 154; Conservative 110; Mismatches 238; Indels 155; Gaps 26; 



Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

I : : : : I : : : I I : I I : I I I : : t I : I : : I : : I | : I 

Db 32 IVIYFLWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86 

Qy 71 TAEAVYVPGYGLAWAQAP I GYS LSLILGGLFFAKPMRSKGYVTMLDPFQQIY-GK 124

I III I I : : : : | I : I : I : I I I I : I : : | | 

Db 87 LA GTGAASGIATGGFEWNALIMVVVLGWVFVPIYIRA- GWTMPEYLQKRFGGK 139 

Qy 125 RMGGLLFIPALMGEMFW--AAAIFSALGAT-ISVIIDVDMHISVIISALIATLYTLVGGL 181

I : I I : I : : I : I I I I II I : : : I : : : : : I I : I I I I : I I I 
Db 140 RIQIYLSILSLLLYIFTKISADIFS--GAIFIQLTLGLDIYVAIIILLVITGLYTITGGL 197

Qy 182 Y S VAYT DWQ L F C I FVG LW I S VP FAL S H P AVAD I G FT AVH AK Y Q 225 

: I I I I : I : I I I II hi II I 

Db 198 AAVI YTDTLQTAIMMVGSVILTGFAFHEVG GYEAFTEKYMRAI PSQISYGNTSIPQ 253 

Qy 226 KPWLGTVDS S EVYSWLDS FLLLMLGGI PW QAYFQRVLSSSSA 267 

I : I : : I : I I I I I I I I I : : 

Db 254 KCYTPREDAFHI FRDAITGDIPWPGLVFGMSILTLWYWCTDQVIVQRCLSAKNL 307 

Qy 268 T YAQVL S FLAAFGCLVMAI PAI L I GAI GAS T DWNQT AYGL P D P KTTEEADMI LP 321

I • • • ...I. • . I a I ..I 

••■ I ■ • • • • • | • • . | . | • • | 

Db 308 SHVKAGCILCGYLKVMPMFLIVMMGMVSRILYTDKVACWPSECERYCGTRVGCTNIAFP 367 

Qy 322 IVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVM 381 

: : II : I : I : : I I I I I I I : : I : I I I : I I : I I : : 

Db 368 TLWELMPNGLRGLMLSVMMASLMSSLTSIFNSASTLFTMDIY-TKIRKKASEKELMIAG 426 

Qy 382 RI-TVFVFGASATAMALLTKTVYG--LWYLSSDLVYI--VIFPQLLCVLFVKGTNTYGAV 436

I : : I : I I : : : I I : I | : | I : I I I I I 

Db 427 RLFMLFLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAIFWKRVNEPGAF 486

Qy 437 AGYVSGLFLRI TG GEPYLYLQPLIFYPGYYPDDNGIY 473 

I I I : I II I I I I :: I 
Db 487 WGLVLGFLIGISRMITEFAYGTGSCMEPSNCPTIICGVHYLYFAIILF 534 

Qy 474 NQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVA-RHSEENMDKTILV 532

I I : I : : I I : I : : : : I : I : I 
Db 535 VISIITWWSLFTKPI PDVHLYRLCWSLRNSKE 568 

Qy 533 KNENIKLD--ELALVKPRQSMTLSSTFTNKEAF LDVDSSPEGSGTED 577

I I I I I : : : I : I : I III |: : |: 

Db 569 --ERIDLDAGEEDIQEAPEEATDTEVPKKKKGFFRRAYDLFCGLDQDKGPKMTKEEE 623



RESULT 5 
A33545 

Na+/glucose cotransporter SGLT1 - human 
C;Species: Homo sapiens (man)

C;Date: 27-Feb-1990 #sequence_revision 27-Feb-1990 #text_change 20-Aug-1999

C;Accession: A33545; A53804

R;Hediger, M.A.; Turk, E.; Wright, E.M.

Proc. Natl. Acad. Sci. U.S.A. 86, 5748-5752, 1989

A;Title: Homology of the human intestinal Na(+)/glucose and Escherichia coli Na(+)/proline cotransporters.

A;Reference number: A33545; MUID:89345544; PMID:2490366

A;Accession: A33545
A;Molecule type: mRNA
A;Residues: 1-664 <HED>

A;Cross-references: GB:M24847; NID:g338054; PIDN:AAA60320.1; PID:g338055
R;Turk, E.; Martin, M.G.; Wright, E.M.
J. Biol. Chem. 269, 15204-15209, 1994

A;Title: Structure of the human Na+/glucose cotransporter gene SGLT1.

A;Reference number: A53804; MUID:94253082; PMID:8195156

A;Accession: A53804

A;Status: preliminary

A;Molecule type: DNA

A;Residues: 1-45 <TUR>

A;Note: sequence extracted from NCBI backbone (NCBIN:147993, NCBIP:147994)
C;Genetics:

A;Gene: GDB:SLC5A1; SGLT1

A;Cross-references: GDB:120375; OMIM:182380

A;Map position: 22q13.1-22q13.1

C;Superfamily: proline carrier protein

C;Keywords: transmembrane protein; transport protein

Query Match 10.3%; Score 306; DB 2; Length 664; 

Best Local Similarity 22.8%; Pred. No. 1e-14;

Matches 148; Conservative 104; Mismatches 218; Indels 178; Gaps 30 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

I :::::::: I I : I I : I I I : : I I : I : : I : : I I : I 

Db 32 IVIYFVWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86

Qy 71 TAEAVYVPGYGLAWAQAPI GYS LSLILGGLFFAKPMRSK-GYVTMLDPFQQIYGK 124 

I III II: | :: | | | | I : I I I M : I 

Db 87 LA GTGAASGIAIGGFEWNALVLVWLGWLFV--PIYIKAGWTM PEYLRK 134

Qy 125 RMGG LL FI PALMGEMFWAAAI F SAL GAT I S VI I DVDMH I SVI I SAL I A 172 

III I I : I : : : I I I I :::::::::: : | 

Db 135 RFGGQRIQVYLSLLSLLLYIFTKISADIFSGAIF INLALGLNLYLAI FLLLAIT 188 

Qy 173 TLYTLVGGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQK--PWL- 229

I I I : I I I : I I I I : I : I I I II I : I Ml I : 

Db 189 ALYTITGGLAAVIYTDTLQTVIMLVGSLILTGFAFHEVG G YDAFME K YMKAI PT I V 244 

Qy 230 GTVDSSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSS 264 

I : I : III: : I : I I I III): 

Db 245 SDGNTTFQEKCYTPRADSFHIFRDPLTGDLPWPGFIFGMSILTLWYWCTDQVIVQRCLSA 304 

Qy 265 SSATYAQ VLSFLAAFGCLVMAI PAIL IGAI GASTDWNQT 303 

: : : : : : I : I : | : : | : | 

Db 305 KNMSHVKGGCILCGYLKLMPMFIMVMPGMISRILYTEKIACWPSECEKYCGTKVGCTNI 364 

Qy 304 AYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNI 363

II | : : | | : I : I : : I I I I | | | : : | : | 

Db 365 AY PTLWELMPNGLRGLMLSVMLASLMSSLTSIFNSASTLFTMDI 409 

Qy 364 YQLSFRQNASDKEIVWVMRITVFV-FGASATAMALLTKTVYG--LWYLSSDLVYI--VIF 418

I I : II : I I : : I : : I I I : : : | | : | | : | 

Db 410 Y-AKVRKRASEKELMIAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIA 468 

Qy 419 PQLLCVLFVKGTNTYGAVAGYVSGLFLRI TG GEPYLY 455 



Db 469 AVFLLAI FWKRVNEPGAFWGLILGLLIGISRMITEFAYGTGSCMEPSNCPTIICGVHYLY 528 

Qy 456 LQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFD 515

: : I I I : I : I I I I : I : : : 

Db 529 FAIILF AISFITIWISLLTKPI PDVHLYR 558 

Qy 516 AV--VARHSEENMDKTILVKNENIKLDELALVKPRQSMTLSSTFTNKE 561

: I I : I : : I I I : | : : : : : : | : 

Db 559 LCWSLRNSKEERID — LDAEEENIQ EGPKETIEIETQVPEKK 598 



RESULT 6 
A53582 

Na+/glucose cotransporter SGLT1 - rat 

C;Species: Rattus norvegicus (Norway rat)

C;Date: 12-Apr-1995 #sequence_revision 12-Apr-1995 #text_change 20-Aug-1999
C;Accession: A53582

R;Lee, W.S.; Kanai, Y.; Wells, R.G.; Hediger, M.A.
J. Biol. Chem. 269, 12032-12039, 1994

A;Title: The high affinity Na(+)/glucose cotransporter. Re-evaluation of function and distribution of expression.

A;Reference number: A53582; MUID:94216314; PMID:8163506
A;Accession: A53582
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-665 <LEE>

A;Cross-references: GB:U03120; NID:g414571; PIDN:AAA19015.1; PID:g414572
C;Superfamily: proline carrier protein

Query Match 10.3%; Score 306; DB 2; Length 665; 

Best Local Similarity 23.5%; Pred. No. 1e-14;

Matches 155; Conservative 105; Mismatches 242; Indels 158; Gaps 29 
Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70



Db 32 IVIYFVWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86



Qy 71 T AEAVYVP G YGLAWAQAP I G Y S L S LILGGLFFAKPMRSK-GYVTMLDPFQQI YGK 124 

I I I I II:: : : I I I I I : I I I I I : I 

Db 87 LA GTGAAAGI AMGGFEWNALVFVWLGWLFV — P I YI KAGWTM PEYLRK 134 



Qy 125 RMGG LL FI PALMGEMFWAAAI FS ALGAT I S VI I DVDMH I S VI I SALI A 172 

III I I : I : : : I I I | : : : : | : : : : : | | 

Db 135 RFGGKRIQI YLSVLSLLLYI FTKI SADI FSGAIF INLALGLDI YLAI FILLAIT 188 

Qy 173 TLYTLVGGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQK--PWL- 229

I II : I I I : I I I I : I : I I : I II hi 1 I I II 

Db 189 ALYTITGGLAAVI YTDTLQTAIMLVGSFILTGFAFREVG GYEAFMDKYMKAI PTLV 244 



Qy 230 --GTVD-SSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSS 264 

I : II: III: : I : I I I MM: 

Db 245 SDGNITVKEECYTPRADSFHIFRDPITGDMPWPGLIFGLSILALWYWCTDQVIVQRCLSA 304 



Qy 265 S SAT YAQVL S FLAAFGC L VMAI P AI L I GAI GAS T DWNQT AYGL P D P KTTEEADM 318 

: :: : I : |: :: I I :: I M :: 

Db 305 KNMSHVKAGCTLCGYLKLLPMFLMVMPGMI S RI LYTDKIACVLP S ECKKYCGT PVGCTNI 364 



Qy 319 ILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIV 378

I : : II : I : I : : I I I t I I I : : I : I I I : I I : I I : : 

Db 365 AYPTLWELMPNGLRGLMLSVMMASLMSSLTSIFNSASTLFTMDIY-TKIRKGASEKELM 423 

Qy 379 WVMRITVFV-FGASATAMALLTKTVYG--LWYLSSDLVYI--VIFPQLLCVLFVKGTNTY 433

I : : I I I : : : I I : I I : I I : I I I 

Db 424 IAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAI FCKRVNEP 483 

Qy 434 GAVAGYVSGLFLRI TG GEPYLYLQPLI FYPGYYPDDN 470 

I I I : I : I II I I I I : : I 

Db 484 GAFWGLI LGFLI GI SRMITEFAYGTGSCMEPSNCPKI ICGVHYLYFAI I LF 534 

Qy 471 GIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAV--VARHSEENMDK 528

I : I : I I I I : I : : : : : I I : I 

Db 535 AISWTVLVISLLTKPI PDVHLYRLCWSLRNSTEERID- 572 

Qy 529 TILVKNENIKLDELALVKPRQSMTLSSTFTNKE AFLDVDSSPEGSGTED 577 

I I : : I | : : : : : | | | | | | : : | : 

Db 573 --LDAGEEE PVEE DPKDTIEIDAEAPQKEKGCFRKAYDLFCGLDQDKGPKMTKEEE 626



RESULT 7 
A44432 

amino acid transport protein - pig 

N;Alternate names: Na+/amino acid cotransporter, SAAT1
C;Species: Sus scrofa domestica (domestic pig)

C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 20-Aug-1999
C;Accession: A44432

R;Kong, C.T.; Yet, S.F.; Lever, J.E.
J. Biol. Chem. 268, 1509-1512, 1993

A;Title: Cloning and expression of a mammalian Na+/amino acid cotransporter with sequence similarity to Na+/glucose cotransporters.

A;Reference number: A44432; MUID:93131881; PMID:8420925

A;Accession: A44432

A;Molecule type: nucleic acid

A;Residues: 1-660 <KON>

A;Cross-references: GB:L02900; NID:g164666; PIDN:AAC37325.1; PID:g164667

A;Experimental source: kidney epithelial cell line LLC-PK1

A;Note: sequence extracted from NCBI backbone (NCBIP:122778)

C;Superfamily: proline carrier protein

C;Keywords: amino acid transport; membrane protein

Query Match 10.2%; Score 303.5; DB 2; Length 660; 

Best Local Similarity 23.2%; Pred. No. 1.5e-14; 

Matches 141; Conservative 103; Mismatches 230; Indels 135; Gaps 26; 
Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70



Db 32 IVIYFWVMAVGLWAMLRT-NRGTV GGFFLAGRDVTWWPMGASLFASNIGSGHFVG 86

Qy 71 T AEAVYVP G YGLA WAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQI Y-GKRM 126 

I I : I I I I : : I I I I : I I : : : : I I I : 

Db 87 LAGTGAASGIAIAAFEW NALLLLLVLGWFFVPI YIKAGVMTMPEYLRKRFGGKRL 141 

Qy 127 GGLLFIPAL MGEMFWAAAI FSALGAT I SVI I DVDMHI S VI I SALI ATLYTLVG 179 

I I : I : : : I I I I : : : I : : : : : I : I I : I 



Db 142 QIYLSILSLFICVALRISSDIFSGAIF IKLALGLDLYLAIFSLLAITAI YTITG 195



Qy 180 GLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQK--PWLGTVD 233

I I I I I I I : I : : I : I : | | I I : : II I : I 

Db 196 GLASVIYTDTLQTIIMLIGSFILMGFAF VEVGGYES FTEKYMNAI PTI VEGDNLTI 251 

Qy 234 SSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSSSSATYAQ 271 

I : I : III: : I I I I I II I I : : : 

Db 252 SPKCYTPQGDSFHI FRDAVTGDI PWPGMI FGMTWAAWYWCTDQVIVQRCLSGKDMSHVK 311 

Qy 272 VL S FLAAFGC LVMAI P AI L I GAI GAS T DWNQTAYGL P D P KT TEE — ADMILPIVLQ 325 

: : I : :: I I : I : I II : : I : : : 

Db 312 AACIMCGYLKLLPMFLMVMPGMISRILYTEKVACWPSECVKHCGTEVGCSNYAYPLLVM 371 

Qy 326 YLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITV 385

II : I : I : : I I I I I I I : : I : : I I : I I : I I : : I : : 

Db 372 ELMPSGLRGLMLSVMLASLMSSLTSIFNSASTLFTMDLY-TKIRKQASEKELLIAGRLFI 430 

Qy 386 FVFGAS AT AMAL LT KT VYG LWYLSSDLVYI— VI FPQLLCVLFVKGTNTYGA V 436 

: : I : I : I I : I I : I I I I I : 

Db 431 I LLI VI S IVWVPLVQVAQNGQLFHYI ES I S S YLGPPIAAVFLLAI FCKRVNEQGAFWGLI 490 

Qy 437 AGYVSGL FLRITG GEPYLYLQPLI FYPGYYPDDNGI YNQKF 477 

1:111 I : I I I I I I :: I : 
Db 491 IGFVMGLIRMIAEFVYGTGSCLAASNCPQIICGVHYLYFALILFF 535

Qy 478 PFKTLAMVTSFLTNICISYLAK YLFE SGTLPPKL D V F D AWARH 521 

I I : I I I I : I : : : : I : I I II 

Db 536 VSILWLAISLLTKPIPDVHLYRLCWALRNSTEERIDL-DAEEKRHEEAHDG 586 

Qy 522 -SEENMDKT 529

I : I : : I 

Db 587 VDEDNPEET 595 



RESULT 8 
E83468 

probable sodium/solute symporter PA1418 [imported] - Pseudomonas aeruginosa
(strain PAO1)

C;Species: Pseudomonas aeruginosa

C;Date: 15-Sep-2000 #sequence_revision 15-Sep-2000 #text_change 31-Dec-2000
C;Accession: E83468

R;Stover, C.K.; Pham, X.Q.; Erwin, A.L.; Mizoguchi, S.D.; Warrener, P.; Hickey,
M.J.; Brinkman, F.S.L.; Hufnagle, W.O.; Kowalik, D.J.; Lagrou, M.; Garber, R.L.;
Goltry, L.; Tolentino, E.; Westbrook-Wadman, S.; Yuan, Y.; Brody, L.L.; Coulter,
S.N.; Folger, K.R.; Kas, A.; Larbig, K.; Lim, R.M.; Smith, K.A.; Spencer, D.H.;
Wong, G.K.S.; Wu, Z.; Paulsen, I.T.; Reizer, J.; Saier, M.H.; Hancock, R.E.W.;
Lory, S.; Olson, M.V.
Nature 406, 959-964, 2000

A;Title: Complete genome sequence of Pseudomonas aeruginosa PAO1, an
opportunistic pathogen.

A;Reference number: A82950; MUID:20437337; PMID:10984043
A;Accession: E83468
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-463 <STO>

A;Cross-references: GB:AE004571; GB:AE004091; NID:g9947360; PIDN:AAG04807.1; GSPDB:GN00131; PASP:PA1418

A;Experimental source: strain PAO1

C;Genetics:

A;Gene: PA1418

Query Match 10.1%; Score 301; DB 2; Length 463; 

Best Local Similarity 25.1%; Pred. No. 1.5e-14; 

Matches 115; Conservative 86; Mismatches 211; Indels 46; Gaps 15; 

Qy 9 IAIIVFYLLILLVGI WAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGF TMTAT 61 

: I : : I : I I I : I I I : I : : I I I : : I II I I I I 

Db 1 MALDIFWLIYAAGMIALGWYGMR RAKTRDD-YLVAGRNLG PGFYLGTMAAT 51 

Qy 62 WVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQI 121 

: I I II III I III:: Mill: I : : : 

Db 52 VLGGASTIGTVRLGYVHGISGFWLCGAIG--LGIVGLSLFLAKPLLKLKIYTVTQVLERR 109 

Qy 122 YGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGGL 181

t : I : : I I : I : I : : : I : : I : I I : : I I : 

Db 110 YN P AARHAS AL I MLVYALMI GAT S T I AI GT VMQVL FGL P FWVS I L I GGGVWL Y S T I GGM 169 

Qy 182 YSVAYTDVVQLFCIFVGL-WISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

: I : II : I I : I I I : : : I : : : I : I : I I : I : : I 

Db 170 WSLTLTDIVQFLIMTVGLVFLLMPLSINDAG GWDALVAKLPASYF DFTAI-GW 221 

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGAS 297 

: I I : I | : | | | : : | | | : | | | : : : | | I 

Db 222 DTIWYFLIYFFGIFIGQDIWQRVFTARSETVAKVAGSAAGIYCVLYGMAG7VLIGMAAKV 281 

Qy 298 TDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASS 357 

III I : | : : : | | : I I I : I I : I : : I : I I : 

Db 282 L LPD LENWNAFASWEHSLPNGIRGLVIAAALAALMSTASAGLLAAST 330 

Qy 358 MFARNIY-QLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIV 416 

: : : : I : I I I II : I : I I : I : : : I : 

Db 331 TVTQDLLPRLRRGRGQSDNGDVHENRIATLLLGLWLGIALWSDVISALTVAYNLLVGG 390 

Qy 417 IFPQLLCVLFVKGTNTYGAVA GYVSGLFLRITGG 450 

: I : : : I III: I : : : I II 

Db 391 MLIPLIGAI YWKRATTAGAITSMTLGFLTVLVFMIKDG 428 



RESULT 9 
B83988 

proline transporter opuE [imported] - Bacillus halodurans (strain C-125) 
C;Species: Bacillus halodurans

C;Date: 01-Dec-2000 #sequence_revision 01-Dec-2000 #text_change 15-Jun-2001
C;Accession: B83988

R;Takami, H.; Nakasone, K.; Takaki, Y.; Maeno, G.; Sasaki, R.; Masui, N.; Fuji,
F.; Hirama, C.; Nakamura, Y.; Ogasawara, N.; Kuhara, S.; Horikoshi, K.
Nucleic Acids Res. 28, 4317-4331, 2000

A;Title: Complete genome sequence of the alkaliphilic bacterium Bacillus
halodurans and genomic sequence comparison with Bacillus subtilis.
A;Reference number: A83650; MUID:20512582; PMID:11058132
A;Accession: B83988
A;Status: preliminary

A;Molecule type: DNA
A;Residues: 1-507 <STO>

A;Cross-references: GB:AP001516; GB:BA000004; NID:g10175192; PIDN:BAB06425.1; GSPDB:GN00137

A;Experimental source: strain C-125
C;Genetics:
A;Gene: opuE

C;Superfamily: proline carrier protein

Query Match 10.1%; Score 299.5; DB 2; Length 507; 

Best Local Similarity 26.2%; Pred. No. 2.2e-14; 

Matches 141; Conservative 84; Mismatches 220; Indels 93; Gaps 27; 

Qy 5 VEGL-IAIIVFYLL-ILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATW 62

I I I : I I : : I I : : I I : I : : : : : I : : I I : : : : : 

Db 4 VE P LAVA I L I AY L VAL LLIGLLSS-KKSS VGMT D FFIAGRNLNKWTVALSAVSSG 57 

Qy 63 VGGG YI NGTAEAVYVP G YGLAWAQAP I GYS LS L I LGGL FFAKPMRS KGY VTMLD 116 

: I II I I I I I : III I : I : I : I 

Db 58 R SAW L VL G VT GT AY AT G L D AVWAVA — GYITVEVF — L FF YVARRFRAY S EQT G S I T I P D 113 

Qy 117 PFQQIYGKR MGGLLFI PALMGEMFWAAAI FSAL GATISVIIDVDMHISVIISA 169 

: : : I I I I : I I I : I I I : : I : : : I 

Db 114 ILETRFNDKTHILRGGSAFI — I M- - FFMI AYVAS QLVAGGGAFAT SMGVS S S T GMWVTA 169 

Qy 170 L I AT L YT LVGGL Y S VAYT DWQ L FC I FVGLWI S VP FAL S H PAVAD I G FTAVHA 222 

: I | | : : | | : : | : | | | | | : I I I I I I I II : I 

Db 170 VILLAYTMLGGFHAVSKTDWQAGFMFVSLVIL PWAI IGLGGFDPLLQVMHT 222 

Qy 223 KYQKPWLGTVDSSEVYSWLDS FLLLMLG-GI PWQAY- FQRVLS S S SAT YAQVLS FLAAFG 280 

: I I : : I I : I I I : I : I : : : : : : 

Db 223 EG GGFTSPFAFGFGAVIGLLGIGFGSPGNPHILVRYMSLKNVKEMRQAALISSVW 277 

Qy 281 C LVMAI P AI L I GAI GAS T DWNQ T AYG L P D P KT T EEAD MI LP I VLQ YLC PVYI S FFGL 337 

: : I I : : I I I I I I I I : | : : I I : : I | 

Db 278 NVLMGWGAVMI GLAG RAY-FPDVSLLPNGDQEQVFLMLGSEILHPLFFGFL-L 328 

Qy 338 GAVSAAVMS SADS S I LSAS SMFARNI YQLS FRQN — ASDKEI VWVMRI TVFVFGASATAM 395 

I I I I : I I II I I : I I I I I : I I i I I : I I I : : I : I : I I I : : 
Db 329 VAVLAAIMSSADSQLLVGSSAFVRDIYQKMFRRNRKLSQKKLVRLSRLTTWFMGLSLIL 388 

Qy 396 ALLTKTVYGLWYLSSDLVYIVIF PQLLCVLFVKGTNTYGAVAGYVSGLFL 445 

II : I : I : I III : I I I : I : : I I 

Db 389 A-FTAQEFVFW MVLFAFGGLGACFGPALLLSFYWKGVTRQGVLWGMIAGLLT 439

Qy 446 RITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLT NICISYLAK 499

I : II I : I I : I I I I I : : I I I 

Db 440 VI LVKQQPQWTY-AFLPDVKELLNTYFFGITYEAVPGFIVATTITWISLFTK 491



RESULT 10 
A42251 

nucleoside transport protein - rabbit 

N;Alternate names: Na+/nucleoside cotransporter, SNST1
C;Species: Oryctolagus cuniculus (domestic rabbit)

C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 20-Aug-1999
C;Accession: A42251

R;Pajor, A.M.; Wright, E.M.

J. Biol. Chem. 267, 3557-3560, 1992

A;Title: Cloning and functional expression of a mammalian Na+/nucleoside cotransporter. A member of the SGLT family.

A;Reference number: A42251; MUID:92156077; PMID:1740408

A;Accession: A42251

A;Molecule type: mRNA

A;Residues: 1-672 <PAJ>

A;Cross-references: GB:M84020; NID:g165550; PIDN:AAA31421.1; PID:g165551
A;Note: sequence extracted from NCBI backbone (NCBIN:82253, NCBIP:82256)
C;Superfamily: proline carrier protein
C;Keywords: membrane protein; nucleoside transport

Query Match 10.0%; Score 298; DB 2; Length 672; 

Best Local Similarity 25.0%; Pred. No. 4e-14; 

Matches 153; Conservative 89; Mismatches 232; Indels 138; Gaps 25; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

I I : I : : | | : : | | : I : Mil: : I I : I : : I : : I I : 

Db 26 IAVIAAYFLLVI GVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGASLFASNIGSGH 80

Qy 68 I N GT AEAVYVP G YGLAWAQAP I GY S L S LILGGLFFAKPMRSKGYVTMLDPFQQI YG 123

II III II:: : : | I I I : I : I I I 
Db 81 FVGLA GT GAAN G LAVAG FEWN AL FWL L L GW L FAP VY LT AGVI TM PQYLR 130

Qy 124 KRMGG LL FI P ALMGEMFWAAAI F — S ALGAT I S VI I DVDMH I S VI I S A 169 

I I I I I : I : : : I : I I I I I : I I I 

Db 131 KRFGGHRI RLYLS VL S L FL YI FT KI S VDMFS GAVFI QQALGWNI YASVIALL 182 

Qy 170 LIATLYTLVGGLYSVAYTDVVQLFCI FVGLWI SVPFALSHPAVADIGFTAVHAKY 224

I : I I : I I I :: I I I I I I I I : I : I | : : : I I 

Db 183 GITMVYTVTGGLAALMYTDTVQTFVIIAGAFILTGYAFHEVG GYSGLFDKYMGAMT 238 

Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAYF 258 

: I : I : I I M : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSSCYRPRPDSYHLLRDPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298 

Qy 259 Q RVL S S S SAT Y AQ VL S F LAAF G C L VMAI P AI L I GAI GAS T D WN Q T AYG L P D P KT TE 314 

I I I : : I : : I ■: I : : I I : : I I : II 

Db 299 QRCLAGRNLTHIKAGCILCGYLKLTPMFLMVMPGMISRILYPDEVACVAPEVCKRVCGTE 358 

Qy 315 E — ADMI L P I VLQYLCPVYI S FFGLGAVSAAVMS SADS S I L SAS SMFARNI YQLS FRQNA 372 

: : : I : : I I : I : I I : I I I I I : I : : I : I I I I t 
Db 359 VGCSNI AYPRLWKLMPNGLRGLMLAVMLAALMS S LAS I FNS S STLFTMDI YTL — RPRA 416 

Qy 373 S DKE I VWVMRI TVFVFGASATAMALLT KTVYG LWYLSSDLVYIV — IFPQLLCVLFV 427 

: I : : I I : I I : I : : I I : I I : : : I I I 

Db 417 GEGELLLVGRLWWFIVAVSVAWLPWQAAQGGQLFDYIQSVSSYLAPPVSAVFWALFV 476

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV-- 485 

I I I I : I I : : I : I I I : I 

Db 477 PRVNEKGAFWGLI GGLLMGLARLI P EFSFGTGSCVRP 513 

Qy 486 TSFLTNICISYLAKYLFE-SG TLP-PKLDVFDAWA-RHSEENMDKTI 530 

: I I : I I I I I I lit:: I : I I I : I 
Db 514 SACPAFLCRVHYLYFAIVLFFCSGLLIIIVSLCTAPIPRKHLHRLVFSLRHSKE 567 



Qy 531 LVKNENIKLDEL 542 

: I :: Ml 
Db 568 --EREDLDADEL 577 



RESULT 11 
S59637 

glucose transport protein SGLT1, intestinal - sheep 
N;Alternate names: Na+/glucose cotransporter SGLT1

C;Species: Ovis orientalis aries, Ovis ammon aries (domestic sheep)

C;Date: 10-Apr-1996 #sequence_revision 19-Apr-1996 #text_change 20-Aug-1999

C;Accession: S59637; S48858

R;Tarpey, P.S.; Wood, I.S.; Shirazi-Beechey, S.P.; Beechey, R.B.
Biochem. J. 312, 293-300, 1995

A;Title: Amino acid sequence and the cellular location of the Na(+)-dependent D-glucose symporters (SGLT1) in the ovine enterocyte and the parotid acinar cell.
A;Reference number: S59637; MUID:96077158; PMID:7492327
A;Accession: S59637

A;Status: nucleic acid sequence not shown
A;Molecule type: mRNA
A;Residues: 1-664 <TAR>

A;Cross-references: EMBL:X82411; NID:g861072; PIDN:CAA57809.1; PID:g861073
A;Experimental source: tissue type jejunal mucosa
R;Wood, I.

submitted to the EMBL Data Library, October 1994
A;Reference number: S48858
A;Accession: S48858
A;Molecule type: mRNA

A;Residues: 1-233,'R',235-432,'V',434-466,'MR',469-664 <WOO>

A;Cross-references: EMBL:X82411

C;Superfamily: proline carrier protein

Query Match 9.9%; Score 294; DB 2; Length 664; 

Best Local Similarity 23.9%; Pred. No. 7.7e-14; 

Matches 127; Conservative 93; Mismatches 202; Indels 110; Gaps 23; 



Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67

| :::::::: | | : | | : | | | : : | | : | : : | : : | | :

Db 32 IVIYFVVVMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86

Qy 68 INGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSK-GYVTMLDPFQQIYGKRM 126

: | | | : || | : : | | : | | : | | | || : ||

Db 87 LAGTGAAAGI ATGGFEWN ALI LWLLGWVFV--P I YI KAGWTM P EYLRKRF 136

Qy 127 GG LLFI PALMGEMFWAAAI F SAL GAT I SVI I DVDMHI SVI I SALIATL 174

|| : | : | : : : | | | | : : : : | : : : : : | | |

Db 137 GGQRI QVYLSVLSLVLYI FTKI SADI FS GAI F INLALGLDLYLAI FI LLAITAL 190

Qy 175 YTLVGGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDS 234

| | : | | | : | | || : | : : | : | | | | :: | || : | | |

Db 191 YTITGGLAAVI YTDTLQTVIMLLGS FI LTGFAFHEVG GYSAFVTKYMNA-I PTVTS 245

Qy 235 SEVYS-WLDSFLLL MLGGIPW QAYFQRVLSSS 265

||: |||: : | : | | | || ||:

Db 246 YGNTTVKKECYTPRADSFHIFRDPLKGDLPWPGLIFGLTIISLWYWCTDQVIVQRCLSAK 305

Qy 266 SAT YAQVLS FLAAFGCLVMAI PAI LI GAIGASTDWNQTAYGLPDPKTTEE AD 317

: : : : : : | : : : | | : | : | | : :

Db 306 NMSHVKAGCIMCGYMKLLPMFLMVMPGMISRILFTEKVACTV— PSECEKYCGTKVGCTN 363 

Qy 318 MI L P I VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFARNI YQLS FRQNAS DKEI 377 

: I : : II : I : I : : I I I I I I I : : I : I I I : I I : M : 

Db 364 IAYPTLWELMPNGLRGLMLSVMLASLMSSLTSIFNSASTLFTMDI Y-TKIRKKASEKEL 422 

Qy 37 8 VWVMRI TVFV- FGASATAMALLTKTVYG — LWYLS S DLVYI - -VI FPQLLCVLFVKGTNT 432 

: | : : | | | : : : I I : I I : | I : I I I 

Db 423 MIAGRLFMLVLIGVSIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAI FCKRVNE 482 

Qy 433 YGAVAGYVSGLFLRI TG GEPYLYLQPLIF 461 

I I I : I : : II I II I : : I 

Db 4 83 PGAFWGLIIGFLIGVSRMITEFAYGTGSCMEPSNCPTIICGVHYLYFAIILF 534 



RESULT 12 
A56765 

sodium-glucose cotransporter homolog - human
C;Species: Homo sapiens (man)

C;Date: 08-Sep-1995 #sequence_revision 08-Sep-1995 #text_change 20-Aug-1999
C;Accession: A56765; I51890

R;Wells, R.G.; Pajor, A.M.; Kanai, Y.; Turk, E.; Wright, E.M.; Hediger, M.A.
Am. J. Physiol. 263, F459-F465, 1992

A;Title: Cloning of a human kidney cDNA with similarity to the sodium-glucose cotransporter.

A;Reference number: A56765; MUID:93035768; PMID:1415574
A;Accession: A56765
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-672 <WEL>

A;Cross-references: GB:M95549; NID:g338052; PIDN:AAA36608.1; PID:g338053
A;Experimental source: kidney cortex
C;Superfamily: proline carrier protein
C;Keywords: transmembrane protein

Query Match 9.8%; Score 292; DB 2; Length 672; 

Best Local Similarity 24.1%; Pred. No. 1.1e-13;

Matches 147; Conservative 91; Mismatches 237; Indels 136; Gaps 22; 

Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

: : I : : | | : : | | : I : I I I I : : I I : | : : | : : | | : 

Db 26 I LVIAAYFLLVI GVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGASLFASNIGSGH 80

Qy 68 I N GTAEAVYVP G YG LAWAQAP I G Y S L S LILGGLFFAKPMRSKGYVTMLDPFQQIYG 123 

II III II:: : : M I I : I : I I I 
Db 81 FVGLA GT GAAS GLAVAGFEWNAL FWLLLGWLFAP VYLTAGVI TM PQYLR 130 

Qy 124 KRMGG LLFI PALMGEMFWAAAI F — SAL GAT I S VI I DVDMH I S VI I S A 169 

II II I : I : : : I : I III I : I I I 

Db 131 KRFGGRRIRLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNI YASVIALL 182 

Qy 170 LIATLYTLVGGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKY 224 

I : I I : I I I : : I I I I I I I I I : : I I : : : II 

Db 183 GI TMI YTVTGGLAALMYTDTVQTFVI LGGACI LMGYAFHEVG GYSGLFDKYLGAAT 238 



Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW- QAYF 258



: I : I : I II : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSFCYRPRPDSYHLLRHPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298

Qy 259 QRVLS S S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPDPKT TE 314 

111:11:: I : I : : I I : : I : I : I i 

Db 299 QRCLAGKSLTHIKAGCILCGYLKLTPMFLMVMPGMISRILYPDEVACWPEVCRRVCGTE 358 

Qy 315 E- - ADMI LP I VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSASSMFARNI YQLS FRQNA 372 

: : : I : : II : I : I I : I I I I I : I : : I : I I I I 

Db 359 VGCSNIAYPRLVVKLMPNGLRGLM1AVMLAALMSSLASIFNSSSTLFTMDIY-TRLRPRA 417 

Qy 373 SDKEIVWVMRI-TVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLCV LFV 427 

I : I : : I I : I I : I : : : I : I : I : I | | | 

Db 418 GDRELLLVGRLVm^FIVWSVAWLPWQAAQGGQLFDYIQAVSSYLAPPVSAVFVLALFV 477 

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV— 485 

I II I : I I : : I : I I : : I 

Db 478 PRVNEQGAFWGLIGGLLMGLARLIP EFSFGSGSCVQP 514 

Qy 486 TSFLTNICISYLAKYLFE-SGTLPPKLDVFDAW ARHSEENMDKTI 530 

: I I : I I I I I I I : : I : I I I : I 
Db 515 SACPAFLCGVHYLYFAIVLFFCSGLLTLTVSLCTAPIPRKHLHRLVFSLRHSKE 568 

Qy 531 LVKNENIKLDE 541 

: I : : II 
Db 569 — EREDLDADE 577 



RESULT 13 
S59638 

glucose transport protein SGLT1, parotid gland - sheep
N;Alternate names: Na+/glucose cotransporter SGLT1
C;Species: Ovis orientalis aries, Ovis ammon aries (domestic sheep)
C;Date: 19-Mar-1997 #sequence_revision 19-Mar-1997 #text_change 07-May-1999
C;Accession: S59638; S48857
R;Tarpey, P.S.; Wood, I.S.; Shirazi-Beechey, S.P.; Beechey, R.B.
Biochem. J. 312, 293-300, 1995
A;Title: Amino acid sequence and the cellular location of the Na(+)-dependent
glucose symporters (SGLT1) in the ovine enterocyte and the parotid acinar cell
A;Reference number: S59637; MUID:96077158; PMID:7492327
A;Accession: S59638
A;Status: nucleic acid sequence not shown; translation not shown
A;Molecule type: mRNA
A;Residues: 1-664 <TAR>
A;Cross-references: EMBL:X82410
A;Experimental source: clone SGLTB; tissue type parotid gland
A;Note: the nucleotide sequence was submitted to the EMBL Data Library, October
1994
C;Superfamily: proline carrier protein
C;Keywords: transmembrane protein

Query Match 9.7%; Score 288; DB 2; Length 664; 

Best Local Similarity 23.7%; Pred. No. 2.1e-13; 

Matches 126; Conservative 93; Mismatches 203; Indels 110; Gaps 23;

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67
I :::::::: I I : I : I I I : : I I : I : : I : : I I :
Db 32 IVIYFVWMAVGLWHMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86

Qy 68 INGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSK-GYVTMLDPFQQIYGKRM 126
: 1 1 1 : II | : : | | : | 1 : 1 1 I I I : II
Db 87 LAGT GAAAG I AT GG F EWN ALILWLLGWVFV--PI YIKAGWTM PEYLRKRF 136

Qy 127 GG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALIATL 174
II : 1 : 1 : : : 1 1 1 1 : : : : 1 : : : : : | | I
Db 137 GGQRIQVYLSVLSLVLYIFTKISADIFSGAIF INLALGLDLYLAIFILLAITAL 190

Qy 175 YTLVGGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDS 234
1 1 : 1 1 1 : 1 1 1 1 : 1 : : 1 : 1 II 1 : : 1 II : | I |
Db 191 YTITGGLAAVI YTDTLQTVIMLLGS FI LTGFAFHEVG GYSAFVTKYMNA- 1 PTVTS 245

Qy 235 SEVYS-WLDSFLLL MLGGIPW QAYFQRVLSSS 265
II: III: : 1 : 1 1 1 Mil:
Db 246 YGNTTVKKECYTPRADSFHIFRDPLKGDLPWPGLIFGLTIISLWYWCTDQVIVQRCLSAK 305

Qy 266 SAT YAQVL S F LAAFG C LVMAI P AI L I GAI GAS T DWNQTAYGL P D P KT T E E AD 317
: : : : : : 1 : :: I I : I : I | : :
Db 306 NMSHVKAGCIMCGYMKLLPMFLMVMPGMISRILFTEKVACTV -- PSECEKYCGTKVGCTN 363

Qy 318 MI LPI VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFARNI YQLS FRQNAS DKEI 377
: 1 : : II : 1 : I : : | | | 1 1 1 1 : : 1 : I I I : | | : | | :
Db 364 I AY PT L WE LMPN GL RGLML S VMLAS LMS S LT S I FN S AS T L FTMD I Y- T K I RKKAS E KEL 422

Qy 378 VWVMRITVFV- FGASATAMALLTKTVYG -- LWYLSSDLVYI -- VIFPQLLCVLFVKGTNT 432
: 1 : : 1 1 1 : : : 1 1 : 1 I : 1 1 : 1 1 1
Db 423 MIAGRLFMLVLIGVSIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAIFCKRVNE 482

Qy 433 YGAVAG YVS GL FLRI TG GEPYLYLQPLIF 461
1 1 1 : 1 : : II 1 1 1 1 : : 1
Db 483 PGAFWGL 1 1 GFL I GVS RMI T E FAYGT GS CMEP SNC PT 1 1 CGVH YL YFAI I LF 534


RESULT 14 
H71097 

hypothetical protein PH1044 - Pyrococcus horikoshii 
C; Species: Pyrococcus horikoshii 

C;Date: 14-Aug-1998 #sequence_revision 14-Aug-1998 #text_change 20-Jun-2000 
C;Accession: H71097 

R;Kawarabayasi, Y.; Sawada, M.; Horikawa, H.; Haikawa, Y.; Hino, Y.; Yamamoto,
S.; Sekine, M.; Baba, S.; Kosugi, H.; Hosoyama, A.; Nagai, Y.; Sakai, M.; Ogura,
K.; Otsuka, R.; Nakazawa, H.; Takamiya, M.; Ohfuku, Y.; Funahashi, T.; Tanaka,
T.; Kudoh, Y.; Yamazaki, J.; Kushida, N.; Oguchi, A.; Aoki, K.; Yoshizawa, T.;
Nakamura, Y.; Robb, F.T.; Horikoshi, K.; Masuchi, Y.; Shizuya, H.; Kikuchi, H.
DNA Res. 5, 55-76, 1998
A;Title: Complete sequence and gene organization of the genome of a
hyper-thermophilic archaebacterium, Pyrococcus horikoshii OT3.
A;Reference number: A71000; MUID:98344137; PMID:9679194
A;Accession: H71097
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-491 <KAW>
A;Cross-references: GB:AP000004; NID:g3236131; PIDN:BAA30142.1; PID:g3257459
A;Experimental source: strain OT3
A;Note: this accession replaces an interim accession for a sequence replaced by
GenBank
C;Genetics:
A;Gene: PH1044
C;Superfamily: proline carrier protein

Query Match 9.6%; Score 286; DB 2; Length 491; 

Best Local Similarity 22.9%; Pred. No. 2.1e-13; 

Matches 125; Conservative 83; Mismatches 197; Indels 142; Gaps 20; 

Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

II::: I I : I I : I : I : I : I I I : I | : | 

Db 17 LIIVLGWVFLSLIVGVMAGIKRKFT LEGYLVSGRTLGLIFLYVLMAGEI YSAYA 70 

Qy 68 INGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFA— — KPMRSKGYVTMLDPFQQI YG 123 

II I I : : | Ml ||:| I :: I II I || ! 

Db 71 FL GT G GWAY S Y GMP I MY A 1 GYGALAYS FGYFYARYVWKAGKAFGCVTQADYFQVRYN 127 

Qy 124 KRMGGLLFI PALMGEMF WAAAI F SAL GAT I S V- - 1 1 D VDMH I S VI I SAL I AT L YT LV 178 

: I : | | : | : | : II : I : : : : I : | : | 

Db 128 SK--ALAVLVALIGIIFNIPYLQLQLQGLGYIVHVGSLGSITPKAGIVIGMIIMMIYVYT 185 

Qy 179 GGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVY 238 

II : : : | : : : | : I : I : I | : I I I i 
Db 186 SGLRGISWTNLLQATLMFIVAWV-VLFTIPFKQFGGIGELFKTLAQTKP 233 

Qy 239 SWLDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLV — MAI PAILIGAIGA 296 

I : I Ml I 1 1| : I : I : I 
Db 234 DHLILHPPLGISW YVSTL-ILSGLGFFMYPQLYPSI 268 

Qy 297 STDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVS 341 

II III: : : I I : :: I I : : I : I : 
Db 269 YGARDLKTLKRN YVLLPLYS I FMI PVI LAGFTVAALGI KLSAPDEAVLKAVE 320 

Qy 342 AAVMS SADS S I LSAS SMFARNI YQLS FRQNASDKEI VWVMRITV 385 

II I : I : : I | : : : : | : | : : : I I I I I : I I I I : I 
Db 321 ITYPSWVLGWGAAGFAAAASTASAILLSLAGLLSKNLYAIA-KPTASDKELVLVSRISV 379 

Qy 386 FVFGASATAMALLTKTVYGLWYLSSDLVYIVI FPQLLCVLFVKGTNTY 433 

: I I :|| I II ::: II : III I I 

Db 380 ILLGLLAMGLAL Y S P G RL VS L L L LAYAGMT QMF P G AVFGL FWK RMN K YAT G 430 

Qy 434 -GAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNI 492 

1 : I I : : : I I : I II I I : | : : : : : 

Db 431 TGIIAGLITVAYLRLV LKKNPL GIH FGLWGLLVNI IVTL 469 

Qy 493 CISYLAK 499 

: : I I I 

Db 470 IVAYLTK 476 
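
The per-hit statistics lines appear to be related in a simple way: for each of the hits in this listing, "Best Local Similarity" equals Matches divided by the sum of Matches, Conservative, Mismatches and Indels. The following Python sketch checks that reading against the figures printed for RESULT 14. The function name is hypothetical, and the formula is inferred from the printed numbers rather than taken from GenCore documentation.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent identity over all aligned columns (inferred relation, not an official formula)."""
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


# Values from "Matches 125; Conservative 83; Mismatches 197; Indels 142;"
similarity = best_local_similarity(125, 83, 197, 142)
print(f"{similarity:.1f}%")
```

The same relation reproduces the 24.1%, 23.7% and 22.1% reported for the neighbouring PIR hits (scores 292, 288 and 285) when their own counts are substituted.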



RESULT 15 
H69670 

sodium/proline symporter opuE - Bacillus subtilis 
N;Alternate names: proline transporter opuE 
C; Species: Bacillus subtilis 

C;Date: 05-Dec-1997 #sequence_revision 05-Dec-1997 #text_change 20-Jun-2000 



C;Accession: H69670; T44450 

R;Kunst, F.; Ogasawara, N.; Moszer, I.; Albertini, A.M.; Alloni, G.; Azevedo,
V.; Bertero, M.G.; Bessieres, P.; Bolotin, A.; Borchert, S.; Boriss, R.;
Boursier, L.; Brans, A.; Braun, M.; Brignell, S.C.; Bron, S.; Brouillet, S.;
Bruschi, C.V.; Caldwell, B.; Capuano, V.; Carter, N.M.; Choi, S.K.; Codani,
J.J.; Connerton, I.F.; Cummings, N.J.; Daniel, R.A.; Denizot, F.; Devine, K.M.;
Duesterhoeft, A.; Ehrlich, S.D.; Emmerson, P.T.; Entian, K.D.; Errington, J.;
Fabret, C.; Ferrari, E.
Nature 390, 249-256, 1997
A;Authors: Foulger, D.; Fritz, C.; Fujita, M.; Fujita, Y.; Fuma, S.; Galizzi,
A.; Galleron, N.; Ghim, S.Y.; Glaser, P.; Goffeau, A.; Golightly, E.J.; Grandi,
G.; Guiseppi, G.; Guy, B.J.; Haga, K.; Haiech, J.; Harwood, C.R.; Henaut, A.;
Hilbert, H.; Holsappel, S.; Hosono, S.; Hullo, M.F.; Itaya, M.; Jones, L.;
Joris, B.; Karamata, D.; Kasahara, Y.; Klaerr-Blanchard, M.; Klein, C.;
Kobayashi, Y.; Koetter, P.; Koningstein, G.; Krogh, S.; Kumano, M.; Kurita, K.;
Lapidus, A.; Lardinois, S.
A;Authors: Lauber, J.; Lazarevic, V.; Lee, S.M.; Levine, A.; Liu, H.; Masuda,
S.; Maueel, C.; Medigue, C.; Medina, N.; Mellado, R.P.; Mizuno, M.; Moestl, D.;
Nakai, S.; Noback, M.; Noone, D.; O'Reilly, M.; Ogawa, K.; Ogiwara, A.; Oudega,
B.; Park, S.H.; Parro, V.; Pohl, T.M.; Portetelle, D.; Porwolik, S.; Prescott,
A.M.; Presecan, E.; Pujic, P.; Purnelle, B.; Rapoport, G.; Rey, M.; Reynolds,
S.; Rieger, M.; Rivolta, C.; Rocha, E.; Roche, B.; Rose, M.; Sadaie, Y.; Sato,
T.; Scanlon, E.
A;Authors: Schleich, S.; Schroeter, R.; Scoffone, F.; Sekiguchi, J.; Sekowska,
A.; Seror, S.J.; Serror, P.; Shin, B.S.; Soldo, B.; Sorokin, A.; Tacconi, E.;
Takagi, T.; Takahashi, H.; Takemaru, K.; Takeuchi, M.; Tamakoshi, A.; Tanaka,
T.; Terpstra, P.; Tognoni, A.; Tosato, V.; Uchiyama, S.; Vandenbol, M.; Vannier,
F.; Vassarotti, A.; Viari, A.; Wambutt, R.; Wedler, E.; Wedler, H.;
Weitzenegger, T.; Winters, P.; Wipat, A.; Yamamoto, H.; Yamane, K.; Yasumoto,
K.; Yata, K.; Yoshida, K.
A;Authors: Yoshikawa, H.F.; Zumstein, E.; Yoshikawa, H.; Danchin, A.

A;Title: The complete genome sequence of the Gram-positive bacterium Bacillus
subtilis.
A;Reference number: A69580; MUID:98044033; PMID:9384377
A;Accession: H69670
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-492 <KUN>
A;Cross-references: GB:Z99107; GB:AL009126; NID:g2632866; PIDN:CAB12486.1;
PID:g2632980
A;Experimental source: strain 168
R;Borriss, R.
submitted to the EMBL Data Library, June 1997
A;Reference number: Z22776
A;Accession: T44450
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-492 <BOR>
A;Cross-references: EMBL:AF011545; PIDN:AAB72182.1
C;Genetics:
A;Gene: opuE
A;Map position: 56 degree
C;Function:
A;Description: catalyzes the uptake of proline by a Na+-dependent transport
mechanism
C;Superfamily: proline carrier protein
C;Keywords: proline transport; sodium transport; symport system; transmembrane
protein

F;44-71/Domain: transmembrane #status predicted <TM1>
F;126-145/Domain: transmembrane #status predicted <TM2>
F;161-183/Domain: transmembrane #status predicted <TM3>
F;189-208/Domain: transmembrane #status predicted <TM4>
F;231-253/Domain: transmembrane #status predicted <TM5>
F;272-295/Domain: transmembrane #status predicted <TM6>
F;311-347/Domain: transmembrane #status predicted <TM7>
F;366-386/Domain: transmembrane #status predicted <TM8>
F;391-417/Domain: transmembrane #status predicted <TM9>
F;422-438/Domain: transmembrane #status predicted <TM10>
F;452-470/Domain: transmembrane #status predicted <TM11>



Query Match 9.6%; Score 285; DB 2; Length 492;

Best Local Similarity 22.1%; Pred. No. 2.5e-13;

Matches 118; Conservative 97; Mismatches 214; Indels 106; Gaps 18;

Qy 5 VEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVG 64
: I : I ::::::: I I : I : I : | : : : | | | : | | : | :
Db 3 IEI I ISLGI YFIAMLLIGWYAFKKTTDIND YMLGGRGLGPFVTALSAGAADMS 55

Qy 65 GGYINGTAEAVYVPGYGLAWAQAPI GYSLSLILGGLFFAKPMRSKGYVTMLDPFQQI 121
I : I I : : I I : II hi I : : I : I I :
Db 56 GWMLMGVPGAMFATGLSTLWKALGLTIGAYSNYLLLAPRLRAYTEAADDAITIPDFFDKR 115

Qy 122 YGKRMGGLLFI PALMGEMFWAAAI FSAL GAT I S VI I DVDMH I S VI I S ALI AT L YTLV 178
: | : | | : : | : | : | | : : : : : | | | |
Db 116 FQH S S S LLKI VS ALI IMI FFT LYT S S GMVS GGRLFE SAFGAD YKLGL FLTTAVWLYTL F 175

Qy 179 GGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVY 238
I I : I : I I I I : | | : | | : I I I I : I : :
Db 176 GGFLAVSLTDFVQGAIMFAAL-VLVPI VAFT -- HVGGVAPTFHEIDAVNPH 223

Qy 239 SWLDSF L L LML GG I P WQAY FQ RVL S S S SAT YAQ VLSFLA 277
III :::::| : | | : : | : |
Db 224 - LLDI FKGASVI S 1 1 S YLAWGLGY YGQPHI IVRFMAIKDIKDLKPARRIG 272

Qy 278 AFGCLVMAI PAILIGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGL 337
:: : ::| I II II :: :|| | : | : I I
Db 273 MSWMIITVLGSVLTGLIG VAYAHKFGVAVKDPEMIFIIFSKILFHPLITGFLL 325

Qy 338 GAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMAL 397
I : I I : I I I I : I : I : : I : II : I I I I I : I : I : : I I I : : I
Db 326 SAILAAIMSSISSQLLVTASAVTEDLYRSFFRRKASDKELVMIGRLSVLVIAVIAVLLSL 385

Qy 398 LTKTVYGLWYLSSDLVYIVIF PQLLCVLFVKGTNTYGAVAGYVSG LF 444
: I : : : I : I : I I : I I : I I : I : | :
Db 386 NP NSTILDLVGYAWAGFGSAFGPAILLSLYWKRMNEWGALAAMIVGAATVL 436

Qy 445 LRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKL 499
: II I I : I : I : I : I : I : I
Db 437 IWITTG LAKSTGVY-EIIP GFILSMIAGIIVSMITK 471



Search completed: September 28, 2004, 17:09:18
Job time : 46 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:          September 28, 2004, 17:08:35 ; Search time 132 Seconds
                                                (without alignments)
                                                1412.910 Million cell updates/sec

Title:           US-10-069-541-6
Perfect score:   2972
Sequence:        1 MAFHVEGLIAIIVFYLLILL..........EAFLDVDSSPEGSGTEDNLQ 580

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        1349238 seqs, 321558718 residues

Total number of hits satisfying chosen parameters:       1349238

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
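
The throughput figure in the run header can be cross-checked from the other reported numbers. A dynamic-programming search fills one matrix cell per (query residue, database residue) pair, so the rate should be roughly query length times residues searched, divided by search time. The Python sketch below is an illustrative check under that assumption; the function name is hypothetical and is not part of GenCore.

```python
def million_cell_updates_per_sec(query_length, db_residues, search_seconds):
    """Estimate alignment throughput, assuming one cell per query/database residue pair."""
    cells = query_length * db_residues
    return cells / search_seconds / 1e6


# Figures from the run header: 580-residue query, 321558718 residues, 132 seconds
rate = million_cell_updates_per_sec(580, 321_558_718, 132)
print(f"{rate:.3f} Million cell updates/sec")
```

Under this assumption the result agrees with the 1412.910 Million cell updates/sec printed by the program.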



Database :       Published_Applications_AA:*
                 1:  /cgn2_6/ptodata/1/pubpaa/US07_PUBCOMB.pep:*
                 2:  /cgn2_6/ptodata/1/pubpaa/PCT_NEW_PUB.pep:*
                 3:  /cgn2_6/ptodata/1/pubpaa/US06_NEW_PUB.pep:*
                 4:  /cgn2_6/ptodata/1/pubpaa/US06_PUBCOMB.pep:*
                 5:  /cgn2_6/ptodata/1/pubpaa/US07_NEW_PUB.pep:*
                 6:  /cgn2_6/ptodata/1/pubpaa/PCTUS_PUBCOMB.pep:
                 7:  /cgn2_6/ptodata/1/pubpaa/US08_NEW_PUB.pep:*
                 8:  /cgn2_6/ptodata/1/pubpaa/US08_PUBCOMB.pep:*
                 9:  /cgn2_6/ptodata/1/pubpaa/US09A_PUBCOMB.pep:
                 10: /cgn2_6/ptodata/1/pubpaa/US09B_PUBCOMB.pep
                 11: /cgn2_6/ptodata/1/pubpaa/US09C_PUBCOMB.pep
                 12: /cgn2_6/ptodata/1/pubpaa/US09_NEW_PUB.pep:
                 13: /cgn2_6/ptodata/1/pubpaa/US10A_PUBCOMB.pep
                 14: /cgn2_6/ptodata/1/pubpaa/US10B_PUBCOMB.pep
                 15: /cgn2_6/ptodata/1/pubpaa/US10C_PUBCOMB.pep
                 16: /cgn2_6/ptodata/1/pubpaa/US10_NEW_PUB.pep:
                 17: /cgn2_6/ptodata/1/pubpaa/US60_NEW_PUB.pep:
                 18: /cgn2_6/ptodata/1/pubpaa/US60_PUBCOMB.pep:



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 



                            %
Result                    Query
   No.     Score   Match  Length  DB  ID                      Description

     1      2972   100.0     580  10  US-09-911-077A-2        Sequence 2, Appli
     2      2972   100.0     580  10  US-09-911-077A-10       Sequence 10, Appl
     3      2972   100.0     580  10  US-09-911-077A-11       Sequence 11, Appl
     4      2972   100.0     580  10  US-09-911-077A-12       Sequence 12, Appl
     5      2972   100.0     580  16  US-10-408-765A-1145     Sequence 1145, Ap
     6      2820    94.9     580  10  US-09-911-077A-6        Sequence 6, Appli
     7      2795    94.0     580  10  US-09-911-077A-4        Sequence 4, Appli
     8      2795    94.0     580  10  US-09-911-077A-24       Sequence 24, Appl
     9    1506.5    50.7     610  12  US-10-241-784-2         Sequence 2, Appli
    10      1453    48.9     576  10  US-09-911-077A-8        Sequence 8, Appli
    11     311.5    10.5     675   9  US-09-733-630-2         Sequence 2, Appli
    12       306    10.3     486  14  US-10-156-761-12818     Sequence 12818, A
    13       306    10.3     664  14  US-10-119-988-12        Sequence 12, Appl
    14     298.5    10.0     675   9  US-09-928-530-2         Sequence 2, Appli
    15     298.5    10.0     675  14  US-10-162-012-27        Sequence 27, Appl
    16     298.5    10.0     675  15  US-10-162-102-27        Sequence 27, Appl
    17     297.5    10.0     471  12  US-10-282-122A-52725    Sequence 52725, A
    18       295     9.9     596  14  US-10-119-988-8         Sequence 8, Appli
    19       292     9.8     672   9  US-09-928-530-5         Sequence 5, Appli
    20       292     9.8     672  14  US-10-162-012-30        Sequence 30, Appl
    21       292     9.8     672  15  US-10-162-102-30        Sequence 30, Appl
    22       287     9.7     678  12  US-10-072-012-438       Sequence 438, App
    23       286     9.6     678  12  US-10-451-822-15        Sequence 15, Appl
    24     281.5     9.5     673  12  US-10-072-012-440       Sequence 440, App
    25       279     9.4     454  12  US-10-282-122A-53545    Sequence 53545, A
    26     277.5     9.3     596   9  US-09-740-026A-2        Sequence 2, Appli
    27     277.5     9.3     596  12  US-10-072-012-114       Sequence 114, App
    28     277.5     9.3     596  12  US-10-169-395-124       Sequence 124, App
    29     277.5     9.3     596  12  US-10-332-447-16        Sequence 16, Appl
    30     277.5     9.3     596  14  US-10-237-859-2         Sequence 2, Appli
    31     277.5     9.3     643  14  US-10-119-988-5         Sequence 5, Appli
    32       277     9.3     596   9  US-09-740-026A-4        Sequence 4, Appli
    33       277     9.3     596  14  US-10-237-859-4         Sequence 4, Appli
    34       277     9.3     597  12  US-10-072-012-436       Sequence 436, App
    35       273     9.2     681  12  US-10-451-822-26        Sequence 26, Appl
    36     272.5     9.2     524   9  US-09-738-626-6949      Sequence 6949, Ap
    37     272.5     9.2     524  12  US-10-627-476-496       Sequence 496, App
    38     272.5     9.2     718  12  US-10-170-385-307       Sequence 307, App
    39     270.5     9.1     718  16  US-10-755-889-176       Sequence 176, App
    40     269.5     9.1     612  12  US-10-072-012-116       Sequence 116, App
    41     269.5     9.1     664  14  US-10-119-988-2         Sequence 2, Appli
    42     262.5     8.8     477  12  US-10-282-122A-67179    Sequence 67179, A
    43       262     8.8     512  15  US-10-161-493-36        Sequence 36, Appl
    44       260     8.7     552  12  US-10-072-012-439       Sequence 439, App
    45       260     8.7     585  12  US-10-451-822-46        Sequence 46, Appl
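
The "% Query Match" column in the summaries appears to be the hit score expressed as a percentage of the perfect score (2972, the query aligned against itself), rounded to one decimal place. The Python sketch below checks this reading on a few rows. The function name and the PERFECT_SCORE constant are illustrative, and the relation is inferred from the printed values rather than from GenCore documentation.

```python
PERFECT_SCORE = 2972  # "Perfect score" reported in the run header


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the query's self-alignment score (inferred relation)."""
    return 100.0 * score / perfect_score


# A few (result number, score) pairs taken from the summaries
for result_no, score in [(1, 2972), (6, 2820), (9, 1506.5), (45, 260)]:
    print(result_no, f"{query_match_percent(score):.1f}")
```

The same calculation reproduces the 9.8%, 9.7% and 9.6% reported for the PIR hits earlier in this report (scores 292, 288 and 286).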



ALIGNMENTS 



RESULT 1 

US-09-911-077A-2 

; Sequence 2, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
;  APPLICANT: BLAKELY, RANDY D.
;  APPLICANT: APPARSUNDARAM, SUBRAMANIAM
;  APPLICANT: FERGUSON, SHAWN
;  TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
;  FILE REFERENCE: VBLT:008US
;  CURRENT APPLICATION NUMBER: US/09/911,077A
;  CURRENT FILING DATE: 2001-07-23
;  NUMBER OF SEQ ID NOS: 27
;  SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 2
;   LENGTH: 580
;   TYPE: PRT
;   ORGANISM: Homo sapiens
US-09-911-077A-2



Query Match 100.0%; Score 2972; DB 10; Length 580; 

Best Local Similarity 100.0%; Pred. No. 1.7e-269; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy        1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy       61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy      121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy      181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy      241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy      301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy      361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy      421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy      481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy      541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
            ||||||||||||||||||||||||||||||||||||||||
Db      541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580





RESULT 2 

US-09-911-077A-10 

; Sequence 10, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
;  APPLICANT: BLAKELY, RANDY D.
;  APPLICANT: APPARSUNDARAM, SUBRAMANIAM
;  APPLICANT: FERGUSON, SHAWN
;  TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
;  FILE REFERENCE: VBLT:008US
;  CURRENT APPLICATION NUMBER: US/09/911,077A
;  CURRENT FILING DATE: 2001-07-23
;  NUMBER OF SEQ ID NOS: 27
;  SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 10
;   LENGTH: 580
;   TYPE: PRT
;   ORGANISM: Homo sapiens
US-09-911-077A-10



Query Match 100.0%; Score 2972; DB 10; Length 580; 

Best Local Similarity 100.0%; Pred. No. 1.7e-269; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy        1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy       61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy      121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy      181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy      241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy      301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy      361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy      421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy      481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy      541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
            ||||||||||||||||||||||||||||||||||||||||
Db      541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 3 

US-09-911-077A-11 

; Sequence 11, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
;  APPLICANT: BLAKELY, RANDY D.
;  APPLICANT: APPARSUNDARAM, SUBRAMANIAM
;  APPLICANT: FERGUSON, SHAWN
;  TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
;  FILE REFERENCE: VBLT:008US
;  CURRENT APPLICATION NUMBER: US/09/911,077A
;  CURRENT FILING DATE: 2001-07-23
;  NUMBER OF SEQ ID NOS: 27
;  SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 11
;   LENGTH: 580
;   TYPE: PRT
;   ORGANISM: Homo sapiens
US-09-911-077A-11

Query Match 100.0%; Score 2972; DB 10; Length 580; 

Best Local Similarity 100.0%; Pred. No. 1.7e-269; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy        1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy       61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy      121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy      181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy      241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy      301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy      361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy      421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy      481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy      541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
            ||||||||||||||||||||||||||||||||||||||||
Db      541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 4 

US-09-911-077A-12 

; Sequence 12, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
;  APPLICANT: BLAKELY, RANDY D.
;  APPLICANT: APPARSUNDARAM, SUBRAMANIAM
;  APPLICANT: FERGUSON, SHAWN
;  TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
;  FILE REFERENCE: VBLT:008US
;  CURRENT APPLICATION NUMBER: US/09/911,077A
;  CURRENT FILING DATE: 2001-07-23
;  NUMBER OF SEQ ID NOS: 27
;  SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 12
;   LENGTH: 580
;   TYPE: PRT
;   ORGANISM: Homo sapiens
US-09-911-077A-12

Query Match 100.0%; Score 2972; DB 10; Length 580;

Best Local Similarity 100.0%; Pred. No. 1.7e-269;

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy        1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy       61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy      121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy      181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy      241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy      301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy      361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy      421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy      481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy      541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
            ||||||||||||||||||||||||||||||||||||||||
Db      541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 5 

US-10-408-765A-1145

Sequence 1145, Application US/10408765A 
Publication No. US20040101874A1 
GENERAL INFORMATION: 
APPLICANT: Ghosh, Soumitra S. 
APPLICANT: Fahy, Eoin D. 
APPLICANT: Zhang, Bing 
APPLICANT: Gibson, Bradford W. 
APPLICANT: Taylor, Steven W. 
APPLICANT: Glenn, Gary M. 
APPLICANT: Warnock, Dale E. 

TITLE OF INVENTION: TARGETS FOR THERAPEUTIC INTERVENTION 
TITLE OF INVENTION: IDENTIFIED IN THE MITOCHONDRIAL PROTEOME 
FILE REFERENCE: 660088.465

CURRENT APPLICATION NUMBER: US/10/408,765A
CURRENT FILING DATE: 2003-04-04
NUMBER OF SEQ ID NOS: 3077

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 1145 
LENGTH: 580 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-408-765A-1145

Query Match 100.0%; Score 2972; DB 16; Length 580; 

Best Local Similarity 100.0%; Pred. No. 1.7e-269; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||||||||||||||||||||||||||||||||||||||||
Db 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 6 

US-09-911-077A-6 

; Sequence 6, Application US/09911077A 

; Publication No. US20030114399A1 

; GENERAL INFORMATION: 

; APPLICANT: BLAKELY, RANDY D. 

; APPLICANT: APPARSUNDARAM, SUBRAMANIAM 

; APPLICANT: FERGUSON, SHAWN 

; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA 
; FILE REFERENCE: VBLT:008US 

; CURRENT APPLICATION NUMBER: US/09/911,077A

; CURRENT FILING DATE: 2001-07-23 

; NUMBER OF SEQ ID NOS : 27 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 6 

; LENGTH: 580

; TYPE: PRT
; ORGANISM: Rattus norvegicus 



US-09-911-077A-6 

Query Match 94.9%; Score 2820; DB 10; Length 580; 

Best Local Similarity 93.1%; Pred. No. 3.1e-255; 

Matches 540; Conservative 24; Mismatches 16; Indels 0; Gaps 0; 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

I I I I I I I : I I I : I I I I I I I I I I I I I : I I I I I : I I I I I I I I I I I I I I I I I I I ! I I | | I I 
Db 1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

I I I I I I I I I I I I I I I I II I I 1 I I I I I I I I I ! I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I :: I 1 I I : I I I I I I I I I I I I 
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I : : I I I I : I 
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAWDIGFTAVHAKYQSPWLGTIESVEVYTW 240 

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

I I : I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I : I I I I II I I I I I I I 
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 NQTAYGFPDPKTKEEADMILPIVLQYLCPVTISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I M I I I I I I I I I : I I I I 
Db 361 RNIYQLSFRQNASDKEIWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 

M I I I I : I I I I II I I I I I I : I I I I I I I I I I I I I II I I I I I I I I I I I I 111111:1111 
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYPDKNGIYNQRFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

I I : I I I I I M I I : I I I I I I I I I I I I I I I I I I : I I I II : I I I I I I I I I I II I : I I I I I I : 
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDIFDAWS RHSEENMDKTILVRNENIKLN 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580

III I I I I I I : I I I II I I I I I I I I II II I I I I I I I I I I I 
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580
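
The percentages in each hit's summary lines appear to follow from the reported counts. The listing does not state the formulas, so the sketch below rests on two assumptions inferred from the numbers themselves: that Best Local Similarity is Matches divided by the total number of aligned columns (Matches + Conservative + Mismatches + Indels), and that Query Match is the hit's Score divided by the query's self-score (2972, the score of the 100% hits). The function names are hypothetical and are not part of any search program.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent identity over all aligned columns (assumed definition)."""
    aligned_columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)


def query_match(score, self_score):
    """Hit score as a percentage of the query's self-score (assumed definition)."""
    return round(100.0 * score / self_score, 1)


# Counts and scores copied from the RESULT 6 summary lines.
print(best_local_similarity(540, 24, 16, 0))  # listing reports 93.1%
print(query_match(2820, 2972))                # listing reports 94.9%
```

The same two ratios also reproduce the figures shown for RESULT 9 (293 matches, 84 conservative, 120 mismatches, 19 indels, score 1506.5), which suggests the assumed definitions hold for gapped alignments as well.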

RESULT 7 

US-09-911-077A-4 

; Sequence 4, Application US/09911077A 

; Publication No. US20030114399A1 

; GENERAL INFORMATION: 

; APPLICANT: BLAKELY, RANDY D. 

; APPLICANT: APPARSUNDARAM, SUBRAMANIAM 

; APPLICANT: FERGUSON, SHAWN 

; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA 
; FILE REFERENCE: VBLT:008US 



; CURRENT APPLICATION NUMBER: US/09/911,077A

; CURRENT FILING DATE: 2001-07-23 

; NUMBER OF SEQ ID NOS : 27 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 4 

; LENGTH: 580

; TYPE: PRT
; ORGANISM: Mus musculus
US-09-911-077A-4 

Query Match 94.0%; Score 2795; DB 10; Length 580; 

Best Local Similarity 92.6%; Pred. No. 6.9e-253; 

Matches 537; Conservative 23; Mismatches 20; Indels 0; Gaps 0 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

I I I I I I i : I I I : I I I I I I I I I I I I I : II I I I : I II I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

I I I I I I I I I I I II II I II I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I : I 

Db 61 TWVGGGYINGTAEAVYGPGCGLAWAHAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFKQ 120 

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : : I I I I : I I I I I I I I I I I I 

Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

I I I I I I I I I I I II I I I : I I I I I I I I II I I I I I I I I I I I I I I II I I I II I :: I I I I : I 
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I : I I I I I I I I I I I I I 
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I M I I I : I I I I 
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 

I I I I I I : I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I : I I I I 
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

11:11111 I I II : | | | | | | | | | | | | | | | | | | | | | | M I I II I I I I I I I I I I : I I I I I I : 
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVRNENIKLN 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 

Ml I I I II I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580 



RESULT 8 

US-09-911-077A-24 



; Sequence 24, Application US/09911077A 

; Publication No. US20030114399A1 

; GENERAL INFORMATION: 

; APPLICANT: BLAKELY, RANDY D. 

; APPLICANT: APPARSUNDARAM, SUBRAMANIAM 

; APPLICANT: FERGUSON, SHAWN 

; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA 
; FILE REFERENCE: VBLT:008US 

; CURRENT APPLICATION NUMBER: US/09/911,077A

; CURRENT FILING DATE: 2001-07-23 

; NUMBER OF SEQ ID NOS : 27 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 24 

; LENGTH: 580

; TYPE: PRT
; ORGANISM: Mus musculus
US-09-911-077A-24 

Query Match 94.0%; Score 2795; DB 10; Length 580; 

Best Local Similarity 92.6%; Pred. No. 6.9e-253; 

Matches 537; Conservative 23; Mismatches 20; Indels 0; Gaps 0 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

I I I I I I I : I I I : I I I I I I I I I I I I I : II I I I : I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

I I I I I I I I I I I I I I I I II Mill I I I I I I I I I I I I I I I I I I I I M I I I II I I I I I : I 
Db 61 TWVGGGYINGTAEAVTGPGCGLAWAHAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFKQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

I I I I I I I I I I I I I I I II i I I I I I I I I I I I I I II I I I I I I I : : I I II : I I I I I I I I I I I I 
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : : I I I I : I 
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I 
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

MINI I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I II I 1 I I II : I I I I 
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

I I I I I I : I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I : I I I I 
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

11:11111 I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I : 
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVRNENIKLN 540



Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580

Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 9 



US-10-241-784-2 

; Sequence 2, Application US/10241784 
; Publication No. US20040048261A1
; GENERAL INFORMATION: 
; APPLICANT: Bayer Corporation 

; TITLE OF INVENTION: Invertebrate Choline Transporter Nucleic Acid, Polypeptides and Uses

; TITLE OF INVENTION: Thereof

; FILE REFERENCE: M07218 

; CURRENT APPLICATION NUMBER: US/10/241,784 
; CURRENT FILING DATE: 2002-09-11 
; NUMBER OF SEQ ID NOS : 2 

; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 2

; LENGTH: 610

; TYPE: PRT

; ORGANISM: Drosophila melanogaster 
US-10-241-784-2 

Query Match 50.7%; Score 1506.5; DB 12; Length 610; 

Best Local Similarity 56.8%; Pred. No. 4.9e-132; 

Matches 293; Conservative 84; Mismatches 120; Indels 19; Gaps 7 

Qy 4 HVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWV 63 

: : I : : : I : : I I I I I I : I I I I I MM: I I ::: I I I I I I I I I I I I I I I 
Db 3 NIAGWSIVLFYLLILWGIWAG-RKKQSGNDSE — EEVMLAGRS I GL FVGI FTMTATWV 59 

Qy 64 GGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIYG 123 

M I I I I I I I I I : I I I I III I I : I I I : I I : I I I I I I : I , : I I I I I | : I 

Db 60 GGGYINGTAEAIYTS — GLVWCQAPFGYALSLVFGGIFFANPMRKQGYITMLDPLQDSFG 117 

Qy 124 KRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGGLYS 183

: I I I I I I I : I I I I I : I I I I I : I t I I I : I I I I I : I I I I : I : II III I I I II 
Db 118 ERMGGLLFLPALCGEVFWAAGILAALGATLSVIIDMDHRTSVILSSCIAI FYTLFGGLYS 177 

Qy 184 VAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSWLDS 243

I I I I I I : I I I I I I : I I I :: I I I I : I : : : I : I | : : : : : | 

Db 178 VAYT DVI Q L FC I F I GLWMC I P FAW SN EHVG S L S DLEVDWI GHVE P KKHWL YI DY 231 

Qy 244 FLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDWNQT 303

111:1111111111 : ||::|||| :||||| :| ||:| 

Db 232 GLLLVFGGI PWQVYFQR QNGRKGPASAYVAAAGC I LMAI P PVLI GAI AKAT PWNET 287 

Qy 304 AYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNI 363

I I I I : I 1111:11111 I :: I M I I I I I I I I I I I I I I I I : I I I : I I I I I I : 
Db 288 DYKGP YPLTVDET SMI LPMVLQYLTPDFVS FFGLGAVSAAVMS SADS S VLSAASMFARNV 347 

Qy 364 YQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLC 423

I • I I I I I I : I I : I I I I :: I | Mill : : I I I I : I I I I I : : : I I I I I 
Db 348 YKLIFRQKASEMEIIWVWRVAIIWGILATIMALTIPSIYGLWSMCSDLVYVILFPQLLM 407 



Qy 424 VL-FVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTL 482

I : I I | | | | ::: | : I : I :: I I I I I I I I I I I I I | | | i : I : 

Db 408 WHFKKHCNTYGSLSAYIVALAI RLSGGEAILGLAPLIKYPGY DEETKEQMFPFRTM 464 

Qy 483 AMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAW 518 

I I : I : I I : I : I : I I I I I I I II II 
Db 465 AMLLSLVTLISVSWWTKMMFESGKLPPSYDYFRCW 500 



RESULT 10 
US-09-911-077A-8 

; Sequence 8, Application US/09911077A 

; Publication No. US20030114399A1 

; GENERAL INFORMATION: 

; APPLICANT: BLAKELY, RANDY D. 

; APPLICANT: APPARSUNDARAM, SUBRAMANIAM 

; APPLICANT: FERGUSON, SHAWN 

; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA 
; FILE REFERENCE: VBLT:008US 

; CURRENT APPLICATION NUMBER: US/09/911,077A

; CURRENT FILING DATE: 2001-07-23 

; NUMBER OF SEQ ID NOS : 27 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 8 

; LENGTH: 576
; TYPE: PRT 

; ORGANISM: Caenorhabditis elegans 
US-09-911-077A-8 



Query Match 48.9%; Score 1453; DB 10; Length 576; 

Best Local Similarity 50.5%; Pred. No. 4.7e-127; 

Matches 295; Conservative 95; Mismatches 150; Indels 44; Gaps 9; 

Qy 7 GLI AI I VFYLLI LLVGIWAAWRTKNS GSAEER SEAI IVGGRDIGLLVGGFTMTATW 62 

I : : I I : I I : I I I : I I I I I : : I : I I : I : : : I I : I I I I I I I I I I I I 

Db 6 G I VAI VF F YVL I LWG I W AG RK SKSSKELES EAG AAT E E VMLAG RN I GT L VG I FTMT AT W 65 



Qy 63 VGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIY 122

I I I I I II I I II : I II I I I : II I I :: I I I I I I I I : I I : I I I I I I I I 

Db 66 VGGAYINGTAEALY — NGGLLGCQAPVGYAISLVMGGLLFAKKMREEGYITMLDPFQHKY 123 

Qy 123 GKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGGLY 182

I : I : I I I : : : I I I : I I II III I I I I I I : I M : : I I : II Ml I I II II I 

Db 124 GQRIGGLMYVPALLGETFWTAAILSALGATLSVILGIDMNASVTLSACIAVFYTFTGGYY 183 



Qy 183 SVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDS-SEVYSWL 241

: I I I I I I I I I II I II I I I : I I I : I II I I : I : I I : 

Db 184 AVAYTDWQLFCI FVGLWVCVPAAMVHDGAKDI SRNA GDWI GEI GGFKETSLWI 237 



Qy 242 DSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDWN 301

I I I I : I I I I I I I I I I II I I : I I I I I I : I I I : : I I I I I I I I I : I I I 
Db 238 D CML L LVFG G I PWQVY FQ RVL S S KTAH GAQT L S FVAGVGC I LMAI P PAL I GAI ARN T DW R 297 



Qy 302 QTAYGLPDPKTTEEA DMI LPI VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSA 355 

II : I I : : I : : I : I I I I I : :: I I I I M I I M I I I I I I I : I I I 

Db 298 MTDYS PWNNGTKVES I PPDKRNMVVPLVFQYLT PRWVAFI GLGAVSAAVMS SADS SVLSA 357 



Qy 356 SSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYI 415

: I | I I I I : : I : I : I I : I I : : I I I I : I M III : : : I I I I I I : I I I I : 
Db 358 ASMFAHN IWKLT I RPHASEKEVI I VMRI AI I CVGIMAT IMALT IQS I YGLWYLCADLVYV 417 

Qy 416 VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQ 475 

: : | | I I I I I : : : : I I II : : I I I I I I I : I I I I : I III : I : I 

Db 418 ILFPQLLCWYMPRSNTYGSLAGYAVGLVLRLIGGEPLVSLPAFFHYPMY TDGV — Q 472 

Qy 47 6 KFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAW ARH S E ENMD KT I L V 532 

| | | : I M : : I I : I : : I I : I I I I : I I II I I : I 

Db 473 YFPFRTTAMLSSMATIYIVSIQSEKLFKSGRLSPEWDVMGCWNIPIDHVPLPSDVSFAV 532 

Qy 533 KNENIKL DELALVKPRQSMTLSSTFTN 559 

: | : : I I I : I : I I : I 

Db 533 SSETLNMKAPNGTPAPVHPNQQPSDENTLLHPYSDQSYYSTNSN 57 6 



RESULT 11 
US-09-733-630-2 

Sequence 2, Application US/09733630 
Patent No. US20020034799A1 
GENERAL INFORMATION: 
APPLICANT: Donoho, Gregory 
APPLICANT: Scoville, John 
APPLICANT: Turner, C. Alexander Jr. 
APPLICANT: Freidrich, Glenn 
APPLICANT: Zambrowicz, Brian 
APPLICANT: Abuin, Alejandro 
APPLICANT: Sands, Arthur T. 

TITLE OF INVENTION: Novel Human Transporter Protein and
TITLE OF INVENTION: Polynucleotides Encoding the Same 
FILE REFERENCE: LEX-0106-USA 
CURRENT APPLICATION NUMBER: US/09/733,630
CURRENT FILING DATE: 2000-12-08 
PRIOR APPLICATION NUMBER: US 60/170,137 
PRIOR FILING DATE: 1999-12-10 
NUMBER OF SEQ ID NOS : 3 

SOFTWARE: FastSEQ for Windows Version 4.0
SEQ ID NO 2 
LENGTH: 675 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-733-630-2 

Query Match 10.5%; Score 311.5; DB 9; Length 675; 

Best Local Similarity 23.0%; Pred. No. 6.4e-20; 

Matches 152; Conservative 112; Mismatches 235; Indels 161; Gaps 28 

Qy 2 AFHVEGL IAIIVFY-LLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGF 56 

I I : I I | | : : | | | : | | | : | : : M : : : I I : I 

Db 18 AFPQKGLEPGDIAVLVLYFLFVLAVGLWSTVKTK RDTVKGY FLAG GDMVWWP VGA 72 

Qy 57 TMTATWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLI-LGGLF FAKPMRS 108 

: : I : I I I : I I I : t I I : I I I I I I = 

Db 73 SLFASNVGSGHF IGLAGSSAATGISVSAYELNGLFSVLMLAWIFLPIYI 121 



Qy 109 KGYVTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSAL GAT-ISVIIDVDM 161 

III: : : till: I I : : I I : : II I : : I : 

Db 122 AGQVTTMPEYLR KRFGGIR-IPIILAVLYLFIYIFTKISVDMYAGAIFIQQSLHLDL 177 

Qy 162 HISVIISALIATLYTLVGGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVH 221

: : : : : I : I I : I I I : I I I I : I : : I : : I I I I : 

Db 178 YLAIVGLLAITAVYTVAGGLAAVIYTDALQTLIMLIGALTLMGY — SFAAVG--GMEGLK 233 

Qy 222 AKY QKPWLGTVDSSEVYS-WLDSFLLLML 249

|| III: : I I 
Db 234 EKYFLALASNRSENSSCGLPREDAFHIFRDPLTSDLPWPGVLFGMSIPSLWY 285 

Qy 250 GGIPW QAYFQRVLS S S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYG 306 

| | | | | : : : : : | : : : | | : : : : I : : I I 

Db 286 WCTDQVIVQRTLAAKNLSHAKGGALMAAYLKVLPLFIMVFPGMVSRILFPDQVA-- 339 

Qy 307 LPDPKTTEE ADMI L P I VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SM 358 

II::: : | : | : : II : : : I I : I I I I III:: 

Db 340 CADPEICQKICSNPSGCSDIAYPKLVLELLPTGLRGLMMAVMVAALMSSLTSIFNSASTI 399 

Qy 359 FARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLW 406 

I : : : I I I : I I : : I I : I II III 

Db 400 FTMDLWN-HLRPRASEKELMIVGRVFV LLLVLVS I LWI PWQASQGGQL 447 

Qy 407 — YLSSDLVYI VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITG-GEPYLYLQP 458 

| : | | : I : I : I I I I I I I : I I I I : : : I : I I 

Db 448 FIYIQSISSYLQPPVAWF IMGCFWKRTNEKGAFWGLISGLLLGLVRLVLDFIYVQP 504 

Qy 459 LIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDV 513 

||: : : : : I : I : I I : I : : : III : : 

Db 505 RC DQPDERPVLVKS I H YLYFSMI LSTVTLITVSTVSWF TEPPSKEMVSHLT 555 

Qy 514 FDAWARHSEENMDKTILVKNENIKLD ELALVKPRQSMTLSSTFTNKEA 562 

|||: I : : I : : : I : III II:: 

Db 556 WFTRHDPWQKEQAPPAAPLSLTLSQNGMPEASSSSSVQFEMVQENTSKTHSCDMTPKQS 615 



RESULT 12 

US-10-156-761-12818 

Sequence 12818, Application US/10156761 
Publication No. US20030119018A1 
GENERAL INFORMATION: 
APPLICANT: OMURA, SATOSHI 
APPLICANT: IKEDA, HARUO 
APPLICANT: ISHIKAWA, JUN 
APPLICANT: HORIKAWA, HIROSHI 
APPLICANT: SHIBA, TADAYOSHI 
APPLICANT: SAKAKI, YOSHIYUKI
APPLICANT: HATTORI, MASAHIRA
TITLE OF INVENTION: NOVEL POLYNUCLEOTIDES 
FILE REFERENCE: 249-262 

CURRENT APPLICATION NUMBER: US/10/156,761
CURRENT FILING DATE: 2002-05-29 
PRIOR APPLICATION NUMBER: JP 2001-204089 
PRIOR FILING DATE: 2001-05-30 
PRIOR APPLICATION NUMBER: JP 2001-272697 
PRIOR FILING DATE: 2001-08-02 



NUMBER OF SEQ ID NOS : 15109 
SEQ ID NO 12818 
LENGTH: 486
TYPE: PRT 

ORGANISM: Streptomyces avermitilis 
US-10-156-761-12818 

Query Match 10.3%; Score 306; DB 14; Length 486; 

Best Local Similarity 27.7%; Pred. No. 1.3e-19; 

Matches 130; Conservative 84; Mismatches 194; Indels 62; Gaps 20; 

Qy 11 IIVFYLL-ILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYIN 69 

: I I I I : I : I I I I : : I II : I : III Ml 

Db 7 VIWYLAGMLAMGWWGMRRAKSKSD FLVAGRRLGPAMYS GTMAAI VLGGAS T I 59 

Qy 70 GTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIYGKRMGGL 129 

I II II I I I I i I I : I I I I : : I I I I 

Db 60 GGVGLGYKYGLSGAWMVFAIGLGL-LALSVFFSARIARLKVY-TVSEMLDLRYGGRAG — 115 

Qy 130 L F I P ALMGEM FWAAAI F SAL GAT I S VI I DVDMHI SVI I SALI ATLYTLVGGLYS 183 

: | : || : |: :||: |: |:: :::|: I |: :||::| 

Db 116 VI S GWMWAYT LMLAVT S T I AY AT I FDVL FDMN RT LAI I LGG S I WAY S T LGGMWS 171 

Qy 184 VAYT D WQ L FC I FVG- LW I S VP FAL S H P AVAD I G FT AVHAK YQKPWLGTVDSSEVY 238 

: I I : I I : | | : : | | : I I I : I : II I III: : : 

Db 172 ITLTDMVQFWKTIGVLLLLLPIAI VKAGGFSAMKAKLPTEYFDP-LG-IGGETIF 225 

Qy 239 SWLDS FLLLMLGGI PWQAYFQRVLS S S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GAST 298 

: : : | : I : I : I I I : : I I I : : I MM: Ml 
Db 226 TYV LI YT FGMLI GQDI WQRVFTARS DTTAKWGGTVAGT YCLVYALAGAVI G 27 6 

Qy 299 DWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSM 358 

|| : || II : : : I I : II Mill:: ::::::: 

Db 277 T AAKVL YP - T L P SAD S AFAT I VK DEL P VGVRGLVLAAALAAVMS T S S GAL I AC AT V 331 

Qy 359 FARNIYQ LSFRQNASDKEIVWVMRITVFVFGASATAMAL-LTKTVYGLWYLSSDLV 413 

: I : Ml: I : I I : : I I : I : I II : I I 

Db 332 ANNDIWSRLRGVS SRK- GDDHDEVRGNRLFI LVMGVAVI CTAIALNDWEALTVAYNLLV 390 

Qy 414 YIVI FPQLLCVLFVKGTNTYGAVAGYVSG LFLRITGG EPYLY 455 

: : I I M : M I M I M : I : I II lit 

Db 391 GGLLVPILGGLLWKRGT-VHGALASVIVGGLAVIGLMATFGILANEPVYY 439 



RESULT 13 
US-10-119-988-12 

; Sequence 12, Application US/10119988 

; Publication No. US20030054453A1 

; GENERAL INFORMATION: 

; APPLICANT: Curtis, Rory A.J. 

; APPLICANT: Chen, Hong 

; APPLICANT: Millennium Pharmaceuticals Inc. 

; TITLE OF INVENTION: 68723, Sodium/Glucose Cotransporter

; TITLE OF INVENTION: Family Members and Uses Therefor 

; FILE REFERENCE: MPI 01- 103P1RNM 

; CURRENT APPLICATION NUMBER: US/10/119,988 

; CURRENT FILING DATE: 2002-04-10 



; PRIOR APPLICATION NUMBER: 60/282,764 
; PRIOR FILING DATE: 2001-04-10 
; NUMBER OF SEQ ID NOS : 18 

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 12

; LENGTH: 664

; TYPE: PRT
; ORGANISM: Homo sapiens 
US-10-119-988-12 



Query Match 10.3%; Score 306; DB 14; Length 664; 

Best Local Similarity 22.8%; Pred. No. 2e-19; 

Matches 148; Conservative 104; Mismatches 218; Indels 178; Gaps 30; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70

| :::::::: I I : I I : I I t : : I I : | : : I : : I I : I 

Db 32 IVIYFVWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWP I GAS LFASNI GSGHFVG 86

Qy 71 TAEAVYVPGYGLAWAQAPIGYS LSLILGGLFFAKPMRSK-GYVTMLDPFQQIYGK 124 

| Ml II: | : : | | | | I : I I I I I : I 

Db 87 LA GT GAAS G I AI GG FEWNALVLVWL GW L FV — P I Y I KAGWTM PEYLRK 134 

Qy 125 RMGG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SAL I A 172 

IN I I : I : : : i I I | :::::::::: : I 

Db 135 RFGGQRI QVYLSLLSLLLYI FTKI SADI FS GAI F INLALGLNLYLAI FLLLAIT 188 

Qy 173 T L YT LVGGL YS VAYTDWQL FC I FVGLWI S VP FAL S H PAVAD I GFTAVHAKYQK- - PWL - 229 

I I I : I I I : I I I I : I : I I I II hi Mil: 

Db 189 ALYTITGGLAAVI YTDTLQTVIMLVGSLILTGFAFHEVG GYDAFMEK YMKAI PT IV 244 

Qy 230 GTVDSSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSS 264 

I : | : III: : I : I I I II I h 

Db 245 SDGNTTFQEKCYTPRADSFHIFRDPLTGDLPWPGFIFGMSILTLWYWCTDQVIVQRCLSA 304 

Qy 265 SSATYAQ VLS FLAAFGCLVMAI PAI L IGAI GASTDWNQT 303 

: : : : : : I : I : I : : I : I 

Db 305 KNMSHVKGGCILCGYLKLMPMFIMVMPGMISRILYTEKIACWPSECEKYCGTKVGCTNI 364 

Qy 304 AYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNI 363 

|| | : : | | : I : I : : I I I I I I I : : I : I 

Db 365 AY PTLVVELMPNGLRGLMLSVMLAS LMS SLT S I FNSASTLFTMDI 4 09 

Qy 364 YQLSFRQNASDKEIWWIRITVFV-FGASATAMALLTKTVYG — LWYLSSDLVYI VI F 418 

t | : | | : I I : : I : : I II : : : I hi h I 

Db 410 Y-AKVRKRASEKELMIAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIA 468 

Qy 419 PQLLCVLFVKGTNTYGAVAGYVSGLFLRI TG GEPYLY 455 

I : I I I I I I : I I : I I I I Ml 

Db 469 AVFLLAIFWKRVNEPGAFWGLILGLLIGISRMITEFAYGTGSCMEPSNCPTIICGVHYLY 528 

Qy 456 LQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFD 515 

: : | I h I : I I I I : I : = ' 

Db 52 9 FAIILF AISFITIWISLLTKPI PDVHLYR 558 

Qy 516 AV- - VARHSEENMDKTI LVKNENI KLDELALVKPRQSMTLS ST FTNKE 561 

: | | : | : : | | | : | : : : : : : I : 

Db 559 LCWSLRNSKEERID— LDAEEENIQ EGPKETIEIETQVPEKK 598 



RESULT 14 
US-09-928-530-2 

Sequence 2, Application US/09928530 
Patent No. US20020156002A1 
GENERAL INFORMATION: 
APPLICANT: Curtis, Rory A. J. 
APPLICANT: Silos-Santiago, Inmaculada

TITLE OF INVENTION: 32620, A NOVEL HUMAN SODIUM-SUGAR
TITLE OF INVENTION: SYMPORTER FAMILY MEMBER AND USES THEREOF 
FILE REFERENCE: 10446-080001 
CURRENT APPLICATION NUMBER: US/09/928,530 
CURRENT FILING DATE: 2001-08-13 
PRIOR APPLICATION NUMBER: 60/227,068 
PRIOR FILING DATE: 2000-08-22 
NUMBER OF SEQ ID NOS : 7 

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 2 
LENGTH: 675 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-928-530-2 

Query Match 10.0%; Score 298.5; DB 9; Length 675; 

Best Local Similarity 22.7%; Pred. No. 1.1e-18;

Matches 149; Conservative 115; Mismatches 238; Indels 155; Gaps 29; 

Qy 2 AFHVEGL IAIIVFY-LLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGF 56 

I I : I I I I : : I I I : I II : I : : I I : : I : II 

Db 18 AFPQKGLEPGDIAVLVLYFLFVLAVGLWSTVKTKR DTVKGYFLAEGNMVWWPVGA- 72 

Qy 57 TMT AT WVG GG Y I N GT AEAVYVP G Y GLAWAQ AP I G Y S L S LILGGLFFAKPMRSKGY 111 

: : I : I I I : I I III : I h I : I : I I : I 

Db 73 S LFASNVGS GHFI GLA GSGAATGISVSAYELNGLFSVLMLAWIFL — PIYIAGQ 124 

Qy 112 VTMLDP FQQI YGKRMGGLL FI PALMGEMFWAAAI FSALGAT I SVIID VDMHIS 164 

||: : : MM: I I : z : : I I : : : : : I : I : : : : 

Db 125 VTTMPE YLR — - KRFGGI R- 1 P 1 1 LAVLYLFI YI FTKI S VDMYAGAI FIQQS SHLDLYLA 180 

Qy 165 VI I SAL I AT L YT LVGGL Y S VAYT D WQL FC I FVGLWI S VP FALS H P AVAD I GFT AVHAK Y 224 

: : I : I I : I I I : I I I I : I : : I : : I II I : I I 

Db 181 I VGLLAI TAVYTVAGGLAAVI YT DALQTLIML I GALTLMGY — S FAAVG- - GMEGLKEKY 236 

Qy 225 QKPWLGTVDSSEVYS-WLDSFLLLMLGGI 252 

III: : I I 

Db 237 FLALASNRSENSSCGLPREDAFHIFRDPLTSDLPWPGVLFGMSIPSLWY 285 

Qy 253 PW QAYFQRVL S S S S AT YAQVL S FLAAFGC LVMAI PAI L I GAI GAS T DWNQT AYGL P D 309 

I I | | | : : : : : | : : : | | : : : : I : : I I I 

Db 286 -WCTDQVIVQRTLAAKNLSHAKGGALMAAYLKVLPLFIMVFPGMVSRILFPDQVA — CAD 342 

Qy 310 PKTTEE ADMI LPI VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFAR 361 

| : : : : I : I : : I I : : : MUM I I M : : I 

Db 343 PEICQKICSNPSGCSDIAYPKLVLELLPTGLRGLMMAVMVAALMSSLTSIFNSASTIFTM 402 



Qy 362 NIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLW- Y 407

i i i • i i • • i i • i ii iii i

Db 403 DLWN-HLRPRASEKELMIVGRVFV LLLVLVS I LWI P WQASQGGQLFI Y 450

Qy 408 LSSDLVYI — VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITG-GEPYLYLQPLI F 461 

: I I : 1:1 : I I I I I I I : I I I I : : : I : I I 

Db 451 IQSISSYLQPPVAWF IMGCFWKRTNEKGAFWGLI SGLLLGLVRLVLDFI YVQPRC- 506 

Qy 4 62 YPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDV 513 

II: : : : : I : I : I I : I : : : III : : 

Db 507 DQPDERPVLVKSIHYLYFSMILSTVTLITVSTVSWF TEPPSKEMVSHLTWFT 558 

Qy 514 - FDAWARH S EENMDKT I LVKNENI KLD ELALVKPRQSMTLSSTFTNKEA 562 

Ml: | : : | : : : I : III I I : : 

Db 559 RHDPWQKEQAPPAAPLSLTLSQNGMPEAS S S S SVQFEMVQENT SKTHSCDMT PKQS 615 



RESULT 15 
US-10-162-012-27 

; Sequence 27, Application US/10162012 

; Publication No. US20030051660A1 

; GENERAL INFORMATION: 

; APPLICANT: Curtis, Rory A.J. 

; APPLICANT: Silos-Santiago, Inmaculada 

; APPLICANT: Gu, Wei 

; TITLE OF INVENTION: NOVEL HUMAN ION CHANNEL AND TRANSPORTER FAMILY MEMBERS 

; FILE REFERENCE: 10448-190001 

; CURRENT APPLICATION NUMBER: US/10/162,012

; CURRENT FILING DATE: 2002-06-04 

; PRIOR APPLICATION NUMBER: US 60/209,845 

; PRIOR FILING DATE: 2000-06-06 

; PRIOR APPLICATION NUMBER: US 09/875,321 

; PRIOR FILING DATE: 2001-06-06
; PRIOR APPLICATION NUMBER: PCT/US01/18340
; PRIOR FILING DATE: 2001-06-06
; PRIOR APPLICATION NUMBER: US 60/209,257
; PRIOR FILING DATE: 2000-06-05
; PRIOR APPLICATION NUMBER: US 09/875,423
; PRIOR FILING DATE: 2001-06-05
; PRIOR APPLICATION NUMBER: PCT/US01/18398
; PRIOR FILING DATE: 2001-06-05
; PRIOR APPLICATION NUMBER: US 60/209,238
; PRIOR FILING DATE: 2000-06-05
; PRIOR APPLICATION NUMBER: US 09/875,363
; PRIOR FILING DATE: 2001-06-05
; PRIOR APPLICATION NUMBER: PCT/US01/18247
; PRIOR FILING DATE: 2001-06-05
; PRIOR APPLICATION NUMBER: US 60/227,068
; PRIOR FILING DATE: 2000-08-22
; PRIOR APPLICATION NUMBER: US 09/928,530
; PRIOR FILING DATE: 2001-08-13
; PRIOR APPLICATION NUMBER: PCT/US01/25475
; PRIOR FILING DATE: 2001-08-15
; PRIOR APPLICATION NUMBER: US 60/226,770
; PRIOR FILING DATE: 2000-08-21
; PRIOR APPLICATION NUMBER: US 09/934,421
; PRIOR FILING DATE: 2001-08-21
; PRIOR APPLICATION NUMBER: PCT/US01/26096
; PRIOR FILING DATE: 2001-08-21
; PRIOR APPLICATION NUMBER: US 60/279,281
; PRIOR FILING DATE: 2001-03-28
; PRIOR APPLICATION NUMBER: US 10/109,029
; PRIOR FILING DATE: 2002-03-28
; PRIOR APPLICATION NUMBER: PCT/US02/09728
; PRIOR FILING DATE: 2002-03-28
; PRIOR APPLICATION NUMBER: US 60/290,288
; PRIOR FILING DATE: 2001-05-11
; PRIOR APPLICATION NUMBER: US (not assigned)
; PRIOR FILING DATE: 2002-05-13
; NUMBER OF SEQ ID NOS: 48
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 27
; LENGTH: 675
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-162-012-27 

Query Match 10.0%; Score 298.5; DB 14; Length 675; 

Best Local Similarity 22.7%; Pred. No. 1.1e-18;

Matches 149; Conservative 115; Mismatches 238; Indels 155; Gaps 29; 

Qy 2 AFHVEGL 1 AI I VFY-LLI LLVGI WAAWRT KNS GSAEERS EAI I VGGRDI GLLVGGF 56 

I I : I t I i : : I I I : I I I : I : : I I : : I : II 

Db 18 AFPQKGLEPGDIAVLVLYFLFVLAVGLWSTVKTKR DT VK G Y FLAE GNMVWW P VGA- 72 

Qy 57 TMTATWVGGGYINGTAEAVYVPGYGLAWAQAP I GYS LS LI LGGLFFAKPMRSKGY 111 

:: I: II I: I I III : I I • hi :l Is I 

Db 73 SLFASNVGSGHFIGLA GSGAATGISVSAYELNGLFSVLMLAWIFL — PI YIAGQ 124 

Qy 112 VTMLDP FQQ I YGKRMGGLLFI PALMGEMFWAAAI FS ALGAT I SVIID VDMHIS 164 

||::: I I I I : I I : : : : I I : : : : : I : | : : : : 

Db 125 VTTMPEYLR KRFGGIR-IPIILAVLYLFIYIFTKISVDMYAGAIFIQQSSHLDLYLA 180 

Qy 165 VII SAL I AT L YT LVGG L Y S VAYT D WQL FC I FVGLWI S VP FAL S H PAVAD I GFT AVHAK Y 224 

: : I : I I : I I I : I I I I : I : : I : : I II I : I I 
Db 181 I VGLLAI TAVYTVAGGLAAVI YTDALQT LIMLI GALTLMGY — S FAAVG — GMEGLKEK Y 236 

Qy 225 QKPWLGTVDSSEVYS-WLDSFLLLMLGGI 252 

III: : I I 

Db 237 FLALASNRSENSSCGLPREDAFHIFRDPLTSDLPWPGVLFGMSIPSLWY 285 

Qy 253 PW QAYFQRVLS S S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPD 309 

| | | | | : : : : : | : : : | | : : : : I : : I I I 

Db 286 - WCT DQVI VQRT LAAKN L S HAKGGALMAAYLKVL P L FI MVFP GMVS RI L FP DQVA — CAD 342 

Qy 310 PKTTEE ADMI LP I VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFAR 361 

I : : : : I : I : : II : : : I I : I I I I I I I : : I 

Db 343 PEICQKICSNPSGCSDIAYPKLVTiELLPTGLRGLMMAVKVAALMSSLTSIFNSASTIFTM 4 02 

Qy 362 NIYQLSFRQNASDKEIWVMRITVFVFGASATAMALLTKTVYGLW Y 4 07 

: : : I I I : I I : : I I : I II III I 

Db 403 DLWN-HLRPRASEKELMIVGRVFV LLLVLVSILWIPWQASQGGQLFI Y 450 

Qy 408 LSSDLVYI VI FPQLLCVLFVKGTNTYGAVAGYVSGLFLRITG-GEPYLYLQPLI F 461 

: I I : |:| : I I I I I I I : I I I I : : : I : I I 



Db 451 IQSISSYLQPPVAWF IMGCFWKRTNEKGAFWGLISGLLLGLVRLVLDFI WQPRC- 506 

Qy 462 YPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDV 513 

II: : : : : | : | : I I : I : : : | | I - 

Db 507 DQPDERPVLVKSIHYLYFSMILSTVTLITVSTVSWF TEPPSKEMVSHLTWFT 558 

Qy 514 - FD AWARH S E ENMD KT I LVKN EN I KL D ELALVKPRQSMTLSSTFTNKEA 562 

Ml: | : : | : : : I : | | I I I : : 

Db 559 RHDPWQKEQAPPAAPLSLTLSQNGMPEASSSSSVQFEMVQENTSKTHSCDMTPKQS 615 



Search completed: September 28, 2004, 17:20:45 
Job time : 134 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
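The "sw model" named above is Smith-Waterman local alignment, and this run reports affine gap parameters (Gapop 10.0, Gapext 0.5). The following is a minimal sketch of a Smith-Waterman score with affine gaps (Gotoh recurrences), not GenCore's implementation. To stay short it uses a hypothetical match/mismatch score instead of the BLOSUM62 table, and it assumes that the first residue of a gap costs `gap_open` and each further residue costs `gap_ext`; the exact gap convention used by the search program is not stated in this output.

```python
def sw_affine_score(a, b, match=5.0, mismatch=-4.0, gap_open=10.0, gap_ext=0.5):
    """Best local alignment score of strings a and b with affine gaps.

    match/mismatch are illustrative stand-ins for a substitution matrix
    such as BLOSUM62 (assumption, not taken from the search output).
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j)
    # E: best score ending with a gap that consumes b[j-1]
    # F: best score ending with a gap that consumes a[i-1]
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_ext)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_ext)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            H[i][j] = max(0.0, H[i - 1][j - 1] + sub, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    # One residue (F) is missing from the second string, so the best
    # local alignment spans a single one-residue gap.
    print(sw_affine_score("ACDEFGHIK", "ACDEGHIK"))
```

Swapping the `match`/`mismatch` pair for a lookup into a real substitution matrix would bring the sketch closer to the scoring used in this search, but the scores would still not be guaranteed to reproduce the values listed here.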



Run on:         September 28, 2004, 16:40:44 ; Search time 126 Seconds
                                               (without alignments)
                                               1452.385 Million cell updates/sec

Title:          US-10-069-541-6
Perfect score:  2972
Sequence:       1 MAFHVEGLIAIIVFYLLILL...EAFLDVDSSPEGSGTEDNLQ 580

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       1017041 seqs, 315518202 residues
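The throughput figure in this run header can be checked against the other numbers it reports. The sketch below assumes that one "cell update" is one query residue compared with one database residue, so the rate is query length times database residues divided by search time. That definition is an inference; it is not stated in the output, but it reproduces the printed value.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Dynamic-programming cells filled per second, in millions.

    Assumes every query residue is compared with every database residue.
    """
    return query_len * db_residues / seconds / 1e6


if __name__ == "__main__":
    # Values from the run header: 580-residue query, 315518202 residues
    # searched, search time 126 seconds.
    rate = million_cell_updates_per_sec(580, 315_518_202, 126)
    print(f"{rate:.3f} Million cell updates/sec")
```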



Total number of hits satisfying chosen parameters: 1017041 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database :       SPTREMBL 25:*
                 1:  sp_archea:*
                 2:  sp_bacteria:*
                 3:  sp_fungi:*
                 4:  sp_human:*
                 5:  sp_invertebrate:*
                 6:  sp_mammal:*
                 7:  sp_mhc:*
                 8:  sp_organelle:*
                 9:  sp_phage:*
                 10: sp_plant:*
                 11: sp_rodent:*
                 12: sp_virus:*
                 13: sp_vertebrate:*
                 14: sp_unclassified:*
                 15: sp_rvirus:*
                 16: sp_bacteriap:*
                 17: sp_archeap:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
ALIGNMENTS



RESULT 1 
Q9GZV3 

ID Q9GZV3 PRELIMINARY; PRT; 580 AA. 

AC Q9GZV3; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 



DE High affinity choline transporter (High-affinity choline transporter 

DE CHT1).

GN CHT1 OR SLC5A7.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Hypothalamus;

RA Bruess M.;

RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases.

RN [2]

RP SEQUENCE FROM N.A.

RC TISSUE=Hypothalamus;

RA Wieland A., Bonisch H., Bruss M.;

RT "Molecular cloning of the human and murine high affinity choline

RT transporters and characterization of the human gene-structure.";

RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases.

RN [3]

RP SEQUENCE FROM N.A.

RX MEDLINE=20483599; PubMed=11027560;

RA Apparsundaram S., Ferguson S.M., George A.L. Jr., Blakely R.D.;

RT "Molecular cloning of a human, hemicholinium-3-sensitive choline

RT transporter.";

RL Biochem. Biophys. Res. Commun. 276:862-867(2000).

RN [4]

RP SEQUENCE FROM N.A.

RC TISSUE=Spinal cord;

RX PubMed=11068039;

RA Okuda T., Haga T.;

RT "Functional characterization of the human high-affinity choline

RT transporter.";

RL FEBS Lett. 484:92-97(2000).

RN [5]

RP SEQUENCE FROM N.A.

RA Bruess M.;

RL Submitted (JAN-2001) to the EMBL/GenBank/DDBJ databases.

RN [6]

RP SEQUENCE FROM N.A.

RA Wieland A., Bonisch H., Bruss M.;

RT "Molecular cloning of the human and murine high affinity choline

RT transporters and characterization of the human gene structure.";

RL Submitted (JAN-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AJ401466; CAC03717.1; -. 

DR EMBL; AF276871; AAG25940.1; -. 

DR EMBL; AB043997; BAB18161.1; 

DR EMBL; AJ308378; CAC88115.1; 

DR EMBL; AJ308379; CAC88115.1; JOINED. 

DR EMBL; AJ308380; CAC88115.1; JOINED. 

DR EMBL; AJ308381; CAC88115.1; JOINED. 

DR EMBL; AJ308382; CAC88115.1; JOINED. 

DR EMBL; AJ308383; CAC88115.1; JOINED. 

DR EMBL; AJ308384; CAC88115.1; JOINED. 

DR PIR; JC7502; JC7502. 

DR Genew; HGNC:14025; SLC5A7.

DR GO; GO:0005624; C:membrane fraction; NAS.

DR GO; GO:0015220; F:choline transporter activity; TAS.

DR GO; GO:0008292; P:acetylcholine biosynthesis; NAS.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

SQ SEQUENCE 580 AA; 63203 MW; 66CB35496CB6E2D6 CRC64;

Query Match 100.0%; Score 2972; DB 4; Length 580;

Best Local Similarity 100.0%; Pred. No. 5.3e-205;

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540

Qy  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
        ||||||||||||||||||||||||||||||||||||||||
Db  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 2 
Q9JMD7 

ID Q9JMD7 PRELIMINARY; PRT; 580 AA.

AC Q9JMD7;

DT 01-OCT-2000 (TrEMBLrel. 15, Created)

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)

DE High-affinity choline transporter CHT1.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=Wistar; TISSUE=Spinal cord;

RX MEDLINE=20116099; PubMed=10649566;

RA Okuda T., Haga T., Kanai Y., Endou H., Ishihara T., Katsura I.;

RT "Identification and characterization of the high-affinity choline

RT transporter.";

RL Nat. Neurosci. 3:120-125(2000).

DR EMBL; AB030947; BAA90484.1; -.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

SQ SEQUENCE 580 AA; 63406 MW; B7CB73323DAD17A7 CRC64; 

Query Match 94.9%; Score 2820; DB 11; Length 580; 

Best Local Similarity 93.1%; Pred. No. 4.3e-194; 

Matches 540; Conservative 24; Mismatches 16; Indels 0; Gaps 0; 

Qy    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
        I I I I I I I : I I I : I I 1 I I I I I I I I I I : I I I I I : I I I I I I I I I I I I I I I I II I I I I I I I I
Db    1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
        I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I M I I I I I I I I I II I I I I I I I
Db   61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
        I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I : : I I I I : I I I I I I I I I I I I
Db  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
        I I I I I II I I I I I I I I I : I I I I I I I I I I I I I I I I II I II I I I I I I II I I I :: I I I I : I
Db  181 LYSVAYTDWQLFCIFIGLWISVPFALSHPAWDIGFTAV71AKYQSPWLGTIESVEVYTW 240

Qy  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
        I I : I I I I I I I I I I I I I II I i I I II I I I II I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I
Db  241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
        I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I II I I I I II I I I I I I I I I
Db  301 NQTAYGFPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
        I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I
Db  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYI11FPQ 420

Qy  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
        I I ! I I I : II I I I I I I I I II : I I I I I I I I I I I I I I I I I II I I I I I I I I lllllhllll
Db  421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYPDKNGIYNQRFPFK 480

Qy  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540
        I I : I I I I I I I I I : I I I I I I I I I I II I I I I I I : I I I I I : I I I I I I I I I I I I I : I I I I I I :
Db  481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDIFDAWSRHSEENMDKTILVRNENIKLN 540

Qy  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
        III I I I I I I : I I I I I I I I I I I I I I I I I I I I II I I I II I
Db  541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580
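The per-result summary figures appear to follow simple ratios. The sketch below assumes that Query Match is the hit score as a percentage of the perfect score (2972 for this query), and that Best Local Similarity is the count of identical positions as a percentage of all alignment columns (matches, conservative substitutions, mismatches and indels). Both definitions are inferred from the printed numbers rather than taken from program documentation; they agree with the values reported for the results in this listing after rounding to one decimal place.

```python
def query_match_pct(score, perfect_score):
    """Hit score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


if __name__ == "__main__":
    # RESULT 2 (Q9JMD7): Score 2820; Matches 540; Conservative 24;
    # Mismatches 16; Indels 0.
    print(round(query_match_pct(2820, 2972), 1))
    print(round(best_local_similarity_pct(540, 24, 16, 0), 1))
```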



RESULT 3 
Q8BGY9 

ID Q8BGY9 PRELIMINARY; PRT; 580 AA. 

AC Q8BGY9; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Solute carrier family 5. 

GN SLC5A7.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=C57BL/6J; TISSUE=Diencephalon, and Head;

RX MEDLINE=22354683; PubMed=12466851;

RA The FANTOM Consortium,

RA the RIKEN Genome Exploration Research Group Phase I & II Team;

RT "Analysis of the mouse transcriptome based on functional annotation of

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002).

DR EMBL; AK034415; BAC28702.1; -.

DR EMBL; AK053063; BAC35253.1; -.

DR MGD; MGI:1927126; Slc5a7.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

SQ SEQUENCE 580 AA; 63364 MW; 6154CE6622772A41 CRC64; 

Query Match 94.4%; Score 2806; DB 11; Length 580; 

Best Local Similarity 92.9%; Pred. No. 4.3e-193; 

Matches 539; Conservative 23; Mismatches 18; Indels 0; Gaps 0 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

I : I I I I I I : I I I : I I I I I II I I I I I I : I I I I I : I I I I I I I I I I I I I II I I I I I I I I I I 
Db 1 MSFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy 61 TWGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 12 0 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 12 0 



Qy 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGAT I SVI I DVDMHI SVI I SALI ATLYTLVGG 180 

II I I I I II I I I I I I I 1 I I I I II I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I II 
Db 121 I YGKRMGGLLFI PALMGEMFWAAAIFSALGAT I SVI I DVDVNISVIVSALIAILYTLVGG 180 

Qy 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 

I I I I I I I I I I I I II I I : I I I I I I I I I I II I I I MINIMUM I II II : : I II I : I 

Db 181 LYSVAYTDWQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240 

Qy 241 L D S FL L LML G G I P WQAY FQ RVL S S S SAT YAQVL S FLAAF G C LVMAI P AI L I GAI GAS T D W 3 00 

II : II I II I I M I I I I I I II I I I I I I I M I II II I I I II II I M I : I II I I I II I I I I I 

Db 241 L DN FL L LML GG I P WQAY FQ RVL S S S SAT YAQVL S FLAAF GC LVMAL P AI C I GAI GAS T DW 300 

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

MINI Mill I II I II I II M II II I I M I M II I I II II II II M II I M II II I I 

Db 301 NQT AYG Y P D P KT KE EADMI L P I VLQ YLC P VYI S F FGLGAVS AAVMS SAD S S I L S AS SMFA 360 

Qy 361 RNI YQLS FRQNASDKEI VWVMRITVFVFGASATAMALLTKTVYGLWYLS SDLVYI VI FPQ 420 

I II I I I I II II I M M I I M M II I I M I I I M I II I I I II I I I I I I I II M I I : II I I 

Db 361 RNI YQLS FRQNASDKEI VWVMRITVLVFGAS AT AMALLTKTVYGLWYLS SDLVYI 1 1 FPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 48 0 

M I I It :\ I I I I I M II II : I I I I I I I I I I I I I I II M I II M M I I II M I : I M I 
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540 

11:11111 I I M : M I M I I I M II II I I II I I I M II II II II I II I M I : M II I I : 
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVRNENIKLN 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 

III II II I I : I I I I I I I II II I I I I II I II II II I I I I 

Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580 
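Each result carries three semicolon-separated summary lines (Query Match, Best Local Similarity, Matches). A minimal sketch of reading such a line into a dictionary follows; the function name `parse_summary` is hypothetical, and the sketch assumes the line is free of extraction damage such as digits split by stray spaces.

```python
def parse_summary(line):
    """Split 'Key value; Key value; ...' into a dict of floats.

    The last whitespace-separated token of each field is taken as the
    value; a trailing '%' is dropped before conversion.
    """
    fields = {}
    for part in line.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.rpartition(" ")
        fields[key.strip()] = float(value.rstrip("%"))
    return fields


if __name__ == "__main__":
    print(parse_summary("Query Match 94.4%; Score 2806; DB 11; Length 580;"))
    print(parse_summary("Best Local Similarity 92.9%; Pred. No. 4.3e-193;"))
```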

RESULT 4 
Q99PK3 



ID Q99PK3 PRELIMINARY; PRT; 580 AA. 

AC Q99PK3; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)

DE Sodium and chloride-dependent high-affinity choline transporter.

GN SLC5A7.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1]

RP SEQUENCE FROM N.A.

RA Apparsundaram S., Ferguson S.M., Blakely R.D.;

RT "Molecular cloning and characterization of human and murine high-

RT affinity choline transporters.";

RL Submitted (FEB-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF276872; AAG36945.2; -.

DR MGD; MGI:1927126; Slc5a7.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

SQ SEQUENCE 580 AA; 63383 MW; DDBF58ED428270AF CRC64; 

Query Match 94.0%; Score 2795; DB 11; Length 580; 

Best Local Similarity 92.6%; Pred. No. 2.7e-192; 

Matches 537; Conservative 23; Mismatches 20; Indels 0; Gaps 0; 

Qy    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
        I I I I II I : I I I : I I I I I I I I I I I I I : II I I I : I II I I I I I I I I I I I I I I I I I I I I I I
Db    1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
        I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I : I
Db   61 TWVGGGYINGTAEAVYGPGCGLAWAHAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFKQ 120

Qy  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
        I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I : I I I I I I I I I I I I
Db  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
        I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I :: I I I I : I
Db  181 LYSVAYTDWQLFCIFIGLWISVPFALSHPAWDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
        I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I II I I I I I I I
Db  241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
        I I I II I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I
Db  301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
        I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I
Db  361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYI11FPQ 420

Qy  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
        I I I I I I : I I II I I I I I I I I : I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I : I I II
Db  421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480

Qy  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540
        11:11111 I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I :
Db  481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVRNENIKLN 540

Qy  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
        III I I I I I I : I I I II I I I I II I I II I I I I I I I I II I I I
Db  541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 5 
Q9ESW5 

ID Q9ESW5 PRELIMINARY; PRT; 580 AA. 

AC Q9ESW5; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 



DE High affinity choline transporter. 

GN SLC5A7 OR CHT1. 

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BALB/cJ; TISSUE=Brain stem;

RA Bruess M.;

RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases.

RN [2]

RP SEQUENCE FROM N.A.

RC STRAIN=BALB/cJ; TISSUE=Brain stem;

RA Wieland A., Bonisch H., Bruss M.;

RT "Molecular cloning of the human and murine high affinity choline

RT transporters and characterization of the human gene-structure.";

RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AJ401467; CAC03719.1; -.

DR MGD; MGI:1927126; Slc5a7.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

SQ SEQUENCE 580 AA; 63331 MW; A4F1387CAA9EAAFE CRC64;

Query Match 93.9%; Score 2791; DB 11; Length 580; 

Best Local Similarity 92.4%; Pred. No. 5.1e-192; 

Matches 536; Conservative 24; Mismatches 20; Indels 0; Gaps 0; 

Qy    1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
        I : I I I I I I : I I I : I I I I I I I I I I t II : I I I I I : II I I I I II I I I I I I II I I I I I I I I
Db    1 MSFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEEHSEAIIVGGRDIGLLVGGFTMTA 60

Qy   61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
        I I I I I I I II I II III II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I
Db   61 TWVGGGYINGTAVAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
        M M I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I :: I I I I : I I I I I I I I I I I I
Db  121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy  181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
        I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I' I I I I I I I I I I I I I II I I :: I I I I : I
Db  181 LYSVAYTDWQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy  241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
        I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I II I I I I : I I I I I II I I I I I I
Db  241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSYLAAFGCLVMALPAICIGAIGASTDW 300

Qy  301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
        I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I M I I I I
Db  301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy  361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
        I I I II I I I I I I I II I I II I I I I I I I I I I II I I I I I I I II I I I I I I I 111:1111
Db  361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYI11FPQ 420

Qy  421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
        I I I II I : I I I I I I I I I I I I : I I I I I I I I I I I II I I II II I I I I I I I 111111:1111
Db  421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480

Qy  481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540
        II: I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I : I I I I I I :
Db  481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVRNENIKLN 540

Qy  541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
        III I I I I I I : I I I I II I I II I I I I I I I I I I I I I I I I I I
Db  541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 6 
Q8UWF0 

ID Q8UWF0 PRELIMINARY; PRT; 584 AA. 

AC Q8UWF0; 

DT 01-MAR-2002 (TrEMBLrel. 20, Created) 

DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE High affinity choline transporter. 

GN CHT1. 

OS Torpedo marmorata (Marbled electric ray).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Chondrichthyes;

OC Elasmobranchii; Squalea; Hypnosqualea; Pristiorajea; Batoidea;

OC Torpediniformes; Torpedinoidei; Torpedinidae; Torpedo.

OX NCBI_TaxID=7788;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Electric lobe;

RA Guermonprez L., O'Regan S., Meunier F.M., Morot-Gaudry-Talarmain Y.;

RT "Cyclosporin, FK506 and rapamycin inhibit neuronal choline uptake via

RT calcineurin-dependent and independent mechanisms.";

RL Submitted (NOV-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AJ420808; CAD12727.1; -.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

SQ SEQUENCE 584 AA; 63660 MW; 995F937B01195A3D CRC64;

Query Match 75.8%; Score 2253; DB 13; Length 584;

Best Local Similarity 72.4%; Pred. No. 2.1e-153;

Matches 418; Conservative 83; Mismatches 70; Indels 6; Gaps 2;
Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSG--SAEERSEAIIVGGRDIGLLVGGFTM 58 



Db 1 MTVHI DGI VAI VLFYLLI LFVGLWAAWKS KNTSMEGAMDRS EAIMI GGRDI GLLVGGFTM 60 

Qy 59 TATWGGGYINGTAEAVTVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPF 118 

I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I : I I I :: I I I I I II I I I I : I I I I I I I I I 

Db 61 TATWVGGGYINGTAEAVYVPGYGLAWAQAPFGYALSLVIGGLFFAKPMRSRGYVTMLDPF 12 0 



Qy 119 QQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLV 178 

I I : I I I I I I I I I I I I I I : I I : I i : I I I I I I I I I : I I I : I : : : : : t I : : I I : I I I I I I I 
Db 121 QQMYGKRMGGLLFIPALLGEIFWS7^VILSALGATLSVIVDININVSVWSAVIAVLYTLV 18 0 

Qy 179 GGLYSVAYTDWQLFCI FVGLWI SVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVY 238 

I I I I I I I I I I I I I II I I I : I I I I I : I I I I : I I I I I II | | : | | : I : 1 : 

Db 181 GGLYSVAYTDWQLFCI FLGLWIS I PFALLNPAVTDI I WANQEVYQEPWVGNIQSKDSL 240 

Qy 239 SWLDSFLLLMLGGI PWQAYFQRVLSS SSATYAQVLSFLAAFGCLVMAI PAIL I GAIGAST 298 

I : I : I I M I I I I I I I I I I I I I I I : I I I I I I I II I I I I I II I :: I I I I :: I I II I I I I 
Db 241 I WI DNFLLLMLGGI PWQVYFQRVLSAS SATYAQVLS FLAAFGCVLMAI PS VLI GAI GT ST 300 

Qy 299 DWNQT AYGL P D P KT T E EADMI L P I VLQ YLC P VY I S F FGLGAVS AAVMS SAD S S I L S AS SM 358 

|||||:MII I I 111111111:111 I I I i II I I I I I I I I I I I I I I I I I I I I I I 

Db 301 DWNQTSYGLPGPI GKNETDMI LPIVLQHLCP P YI S FFGLGAVSAAVMS SADS S I LSASSM 360 

Qy 359 FARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIF 418 

I I I I I I I : I I I I I I I I I I I I I I I I : I : I 1 : I I : I I I I : : : II I II I I I I I I I : : I I 
Db 361 FARNIYHLAFRQEASDKEIVWVMRITIFLFGGAATSMALLA 420 

Qy . 419 PQLLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPD DNGIYN 474 

Ml: I I I I I I I I I t I :: I I I : I I I I : I I I I I I :: I I hill I I h : I 

Db 421 PQLISVLFVKGTNTYGSIAGYIIGFLLRISGGEPYLHMQPFIYYPGCYLDHSFGDDPVYV 480 

Qy 475 QKFPFKTIAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKN 534 

I : I I I || : M : I I I I : I I I I I I I II I I II II h I h : I h I I I I I h 
Db 481 QRFPFKTMAMLFSFLGNTGVSYLVKYLFVSGILPPKLDFLDSWSKHSKEIMDKTFLMNQ 540 

Qy 535 ENIKLDELALVKPRQSMTLSSTFTNKEAFLDVDSSPE 571 

: I I I I I II I : : h : : : I I 

Db 541 DNITLSELVHVNPIHSASVSAALTNKEAFEDIEPNPE 57 7 
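Each database entry in this listing uses two-letter line codes (ID, AC, DT, DE, GN, OS, OC, OX, RN, RP, RC, RX, RA, RT, RL, DR, FT, SQ), followed by the search program's own summary and alignment lines. The sketch below groups the coded lines of one entry by their code. The function name `group_by_line_code` is hypothetical, and the rule used to tell entry lines from alignment lines (a two-character, all-uppercase first token) is a simplifying assumption that works for the lines shown here, since `Qy`, `Db` and `Query Match` all contain lowercase letters.

```python
from collections import defaultdict


def group_by_line_code(lines):
    """Collect flat-file lines under their two-letter line code."""
    record = defaultdict(list)
    for line in lines:
        line = line.rstrip()
        code, _, text = line.partition(" ")
        # Keep only lines whose first token is a two-letter uppercase code.
        if len(code) != 2 or not code.isupper():
            continue
        record[code].append(text.strip())
    return dict(record)


if __name__ == "__main__":
    entry = [
        "ID Q8UWF0 PRELIMINARY; PRT; 584 AA.",
        "AC Q8UWF0;",
        "DE High affinity choline transporter.",
        "GN CHT1.",
        "DR InterPro; IPR001734; Na/solut_symport.",
        "DR Pfam; PF00474; SSF; 1.",
        "Query Match 75.8%; Score 2253; DB 13; Length 584;",
    ]
    for code, values in group_by_line_code(entry).items():
        print(code, values)
```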



RESULT 7 
Q8AV27 

ID Q8AV27 PRELIMINARY; PRT; 377 AA. 

AC Q8AV27; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Putative high affinity choline transporter 1 (Fragment).

GN CHT1.

OS Gallus gallus (Chicken).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;

OC Gallus.

OX NCBI_TaxID=9031;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Ciliary ganglion;

RX MEDLINE=22308883; PubMed=12421710;

RA Mueller F., Rohrer H.;

RT "Molecular control of ciliary neuron development: BMPs and downstream

RT transcriptional control in the parasympathetic lineage.";

RL Development 129:5707-5717(2002).

DR EMBL; AJ511267; CAD53475.1;

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

FT NON_TER 1 1

FT NON_TER 377 377 

SQ SEQUENCE 377 AA; 41070 MW; 995293969378F8E7 CRC64; 

Query Match 56.5%; Score 1679; DB 13; Length 377; 

Best Local Similarity 85.4%; Pred. No. 1.9e-112; 

Matches 322; Conservative 28; Mismatches 27; Indels 0; Gaps 0; 

Qy 144 AI FSALGATI SVI I DVDMHI SVI I SALI ATLYTLVGGLYSVAYTDWQLFCI FVGLWI SV 203 

I I I I I I I I I II I I I ::::: I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I M : I II I I I 
Db 1 AIFSALGATISVITDINVNLSVIISALIATLYTLVGGLYSVAYTDVVQLFCIFLGLWISV 60 

Qy 204 PFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSWLDSFLLLMLGGIPWQAYFQRVLS 263 

11111:111 I I I I I I I I : I Mill: I : I : I I I : I I I I I i I I I I I I I I I I I I 
Db 61 PFALSNPAVTDIGFTAVHEVHQAPWLGTIGSLNIYTWLDNFLLLTFGGIPWQAYFQRVLS 120 

Qy 264 SSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDWNQTAYGLPDPKTTEEADMILPIV 323 

I I II I I I I I I I I I I I I I I : I I I I I I : I I I I I I I I I I I I I I I : I I I : I I I I I I I I I 
Db 121 SSSATYAQVLSFLAAFGCIVMAIPAVLIGAIGASTAWNQTEYGVPDPIANKEADMILPIV 180 

Qy 324 LQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRI 383 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I M I I I I I I I I I I I : I I I I I I I I 
Db 181 LQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDREIVWVMRI 240 

Qy 384 TVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSGL 443 

I || : | | | | | I I I I I I : I I I I I I I I I I I I I I : I I I I I I I I I I : M I I I I I I : I I I : I I 
Db 241 TVFLFGASATAMALLASSVYGLWYLSSDLVYIIIFPQLLCVLFIKGTNTYGAIAGYLFGL 300 

Qy 444 FLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFE 503 

I I II I I I II I I I II I I : I I I I I I : I M I : I I I I I I II : I I I III : I I I I I I I I 
Db 301 VLRITGGEPYLYLQPLIYYPGCYPDENNIYVQRFPFKTLAMLTSFFTNIIVSYLAKYLFG 360 

Qy 504 SGTLPPKLDVFDAVVAR 520 

I I I I II I II I I I I I I 
Db 361 SGTLPPKLDFLDAVVAR 377 



RESULT 8 
Q9VE46 

ID Q9VE46 PRELIMINARY; PRT; 614 AA. 

AC Q9VE46; Q961W3; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE CG7708 protein (GH02984p). 

GN CG7708. 

OS Drosophila melanogaster (Fruit fly). 

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; 

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 



RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley; 

RX MEDLINE=20196006; PubMed=10731132; 

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., 

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., 

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N., 

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., 

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D., 

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G., 

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D., 

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., 

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., 

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., 

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., 

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P., 

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., 

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., 

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W., 

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., 

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M., 

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J., 

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C., 

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., 

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z., 

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X., 

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D., 

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A., 

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L., 

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M., 

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G., 

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H., 

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T., 

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., 

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X., 

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J., 

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., 

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L., 

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O., 

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.; 

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

RN [2] 

RP SEQUENCE FROM N.A. 

RA Celniker S.E., Adams M.D., Kronmiller B., Wan K.H., Holt R.A., 

RA Evans C.A., Gocayne J.D., Amanatides P.G., Brandon R.C., Rogers Y., 

RA Banzon J., An H., Baldwin D., Banzon J., Beeson K.Y., Busam D.A., 

RA Carlson J.W., Center A., Champe M., Davenport L.B., Dietz S.M., 

RA Dodson K., Dorsett V., Doup L.E., Doyle C., Dresnek D., Farfan D., 

RA Ferriera S., Frise E., Galle R.F., Garg N.S., George R.A., 

RA Gonzalez M., Houck J., Hoskins R.A., Hostin D., Howland T.J., 

RA Ibegwam C., Jalali M., Kruse D., Li P., Mattei B., Moshrefi A., 

RA McIntosh T.C., Moy M., Murphy B., Nelson C., Nelson K.A., Nunoo J., 

RA Pacleb J., Paragas V., Park S., Patel S., Pfeiffer B., 

RA Phouanenavong S., Pittman G.S., Puri V., Richards S., Scheeler F., 

RA Stapleton M., Strong R., Svirskas R., Tector C., Tyler D., 

RA Williams S.M., Zaveri J.S., Smith H.O., Venter J.C., Rubin G.M.; 

RT "Sequencing of Drosophila melanogaster genome."; 

RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP SEQUENCE FROM N.A. 

RA Misra S., Crosby M.A., Matthews B.B., Bayraktaroglu L., Campbell K., 

RA Hradecky P., Huang Y., Kaminker J.S., Prochnik S.E., Smith C.D., 

RA Tupy J.L., Bergman C., Berman B., Carlson J.W., Celniker S.E., 

RA Clamp M., Drysdale R., Emmert D., Frise E., de Grey A., Harris N., 

RA Kronmiller B., Marshall B., Millburn G., Richter J., Russo S., 

RA Searle S.M.J., Smith E., Shu S., Smutniak F., Whitfield E., 

RA Ashburner M., Gelbart W.M., Rubin G.M., Mungall C.J., Lewis S.E.; 

RT "Annotation of Drosophila melanogaster genome."; 

RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases. 

RN [4] 

RP SEQUENCE FROM N.A. 

RA Adams M.D., Celniker S.E., Gibbs R.A., Rubin G.M., Venter C.J.; 

RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases. 

RN [5] 

RP SEQUENCE FROM N.A. 

RA FlyBase; 

RL Submitted (SEP-2002) to the EMBL/GenBank/DDBJ databases. 

RN [6] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley; 

RA Stapleton M., Brokstein P., Hong L., Agbayani A., Carlson J., 

RA Champe M., Chavez C., Dorsett V., Farfan D., Frise E., George R., 

RA Gonzalez M., Guarin H., Li P., Liao G., Miranda A., Mungall C.J., 

RA Nunoo J., Pacleb J., Paragas V., Park S., Phouanenavong S., Wan K., 

RA Yu C., Lewis S.E., Rubin G.M., Celniker S.; 

RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AE003723; AAF55583.2; -. 

DR EMBL; AY047521; AAK77253.1; -. 

DR FlyBase; FBgn0038641; CG7708. 

DR GO; GO:0016020; C:membrane; IEA. 

DR GO; GO:0005215; F:transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR001734; Na/solut_symport. 

DR Pfam; PF00474; SSF; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1. 

SQ SEQUENCE 614 AA; 66893 MW; 71A77E1216360042 CRC64; 

Query Match 52.4%; Score 1557.5; DB 5; Length 614; 

Best Local Similarity 58.5%; Pred. No. 1.8e-103; 

Matches 302; Conservative 84; Mismatches 115; Indels 15; Gaps 6 

Qy 4 HVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWV 63 

: : I : : : I : : I I I I I I : I I I I I MM: I I ::: I I I I I I I I I I I I I I I 
Db 3 NIAGWSIVLFYLLILWGIWAG-RKKQSGNDSE--EEVMLAGRSIGLFVGIFTMTATWV 59 

Qy 64 GGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIYG 12 3 

11111111111:1 I I I I I I I I : I 1 I : I I : I I I I I I : I I : I I I II I : I 

Db 60 GGGYINGTAEAIYTS — GLVWCQAPFGYALSLVFGGIFFANPMRKQGYITMLDPLQDSFG 117 

Qy 124 KRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALIATLYTLVGGLYS 183 

: I I I II I I : I I I I I : I I I I I : I I I I I : I I I I I : I I I I : I : I I I I I I I I I I 
Db 118 ERMGGLLFLPALCGEVFWAAGILAALGATLSVIIDMDHRTSVILSSCIAI FYTLFGGLYS 177 



Qy 184 VAYT D WQL FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAK YQ K PWL GT VD S S E VY S WL D S 243 

I I I i I I : I I I I I I : I I I : : I I I I : I : • = I : I I : : : : : I 

Db 178 VAYTDVIQLFCIFIGLWMCIPFAWSNEHVGSL S DLEVDWI GHVEPKKHWLYI DY 231 

Qy 244 F L L LML G G I P WQA Y FQ RVL S S S SAT YAQ VL S F LAAFG C L VMAI P AI L I GAI GAS T DWN QT 303 

III: llllll t I I 1 I I I t :l ll:ll::ll ll::llll : :| 11:1 

Db 232 GLLLVFGGI PWQVYFQRVLS SKTAGRAQLLS YVAAAGCI LMAI P PVLI GAI AKATPWNET 291 

Qy 304 AYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNI 363 

I | | I : | | | | I : | | I I I I :: I I I I I I I I M I I I I M I I I : I I I : I I I I I I : 
Db 292 DYKGPYPLTVDETSMILPMVLQYLTPDFVSFFGLGAVSAAVMSSADSSVLSAASMFARNV 351 

Qy 364 YQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLC 423 

I : I I I I I I : M : I I I I :: I I I I I I I : : I I I I : I I I I I : : : I I I I I 
Db 352 YKLI FRQKAS EMEI IWVMRVAI I WGI LAT IMALT I PS I YGLWSMCSDLVYVI LFPQLLM 411 

Qy 424 VL-FVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTL 482 

I : I I I I I I : : : I : I : I : : I I I I I I II I I I I I I I I I : I : 

Db 412 WHFKKHCNTYGSLSAYIVALAI RLSGGEAILGLAPLIKYPGY DEETKEQMFPFRTM 468 

Qy 483 AMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAW 518 

I I : I : I I : I : I : I I I I I I I II II 
Db 469 AMLLSLVTLISVSWWTKMMFESGKLPPSYDYFRCW 504 



RESULT 9 
Q9GPB1 

ID Q9GPB1 PRELIMINARY; PRT; 579 AA. 

AC Q9GPB1; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Choline cotransporter. 

OS Limulus polyphemus (Atlantic horseshoe crab). 

OC Eukaryota; Metazoa; Arthropoda; Chelicerata; Merostomata; Xiphosura; 

OC Limulidae; Limulus. 

OX NCBI_TaxID=6850; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21261948; PubMed=11368908; 

RA Wang Y., Cao Z., Newkirk R.F., Ivy M.T., Townsel J.G.; 

RT "Molecular cloning of a cDNA for a putative choline co-transporter 

RT from Limulus CNS."; 

RL Gene 268:123-131(2001). 

DR EMBL; AY011119; AAG41055.1; -. 

DR GO; GO:0016020; C:membrane; IEA. 

DR GO; GO:0005215; F:transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR001734; Na/solut_symport. 

DR Pfam; PF00474; SSF; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1. 

SQ SEQUENCE 579 AA; 62937 MW; FE7F29D4FAF47F04 CRC64; 



Query Match 51.5%; Score 1530.5; DB 5; Length 579; 

Best Local Similarity 52.0%; Pred. No. 1.5e-101; 

Matches 305; Conservative 115; Mismatches 134; Indels 33; Gaps 11 



Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

| | : : I : : : I : I i : : I I : I I I I I : I I : I : : I I : : I I : I I I I I I I I I 
Db 1 MAVN I L GWS I G I F YVI I L I VG I WAS - RKKKT SSGQSETEEI MLAGRN IGF L VGVLTMT A 59 

Qy 61 TWVGGGYINGTAEAVYVP GYGLAWAQAP I GYS L S LI LGGL FFAKPMRS KGYVTMLDP FQQ 120 

I I I M M I I I II I I : I II I III IhMI : I I : III I I I : I I I I I I I I I : 

Db 60 TWVGGGYINGTAEAMY--NNGLVWCQAPFGYALSLFIGGIVFAKKMRSQGYVTMLDPLQE 117 

Qy 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI ATLYTLVGG 180 

: I : I I I I I I : II I I I : I I : I I I : I I I I I I I I I : : : I : I : I : I I II II 
Db 118 NFGSKMGGLLFLPALCGEIFWSAAILAALGATISVITELESSTSIIVSSSIAVFYTFFGG 177 

Qy 181 L YS VAYT DWQ L FC I FVGLW I S VP FAL S H PAVAD I G FT AVHAK YQ K P WL GT VD S S EVY S W 240 

I I I I I I I I : I I I I I I I I I :: I I : I I I I : : I I : I I : 

Db 178 FYSVAYTDVIQLFCI FFGLWLCI PFSFSHEAVGSLS SIDFLGSVKLSDAGIN 229 

Qy 241 LDS FLLLMLGGI PWQAYFQRVLS S S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GASTDW 300 

: I : I I I : I I I I I I I I I I I I I ::: I I I I I I I I I : I I I I I I I I I I I : I I 
Db 230 VD IWLLLIFGGI PWQVY FQ RVL S AKNVS N AQVL S YVAAVG CWMAI PAI L I GVI AKAT AW 289 

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

I : I I I : II : :: I I : I I I I I : I I I I I I I I I I I M I I : I I I I I I II I : I : 
Db 290 NETALGM--PLTPNDTSLVLPLVLHYLTPTAVSFFGLGAVSAAVMSSSDSSILSASSLFS 347 

Qy 361 RNI YQLS FRQNAS DKEI VWVMRI TVFVFGAS ATAMALLTKTVYGLWYLS S DLVYI VI FPQ 420 

I I : I : I III I I : : I : I I I : I I : : I I MINI I : I I I I I II II I I : I : : : I I I 
Db 348 RNVYKLI FRQKAS EREWWVI RI S I LWGI LATAMALTVKSVYGLWYLSS DLI YVI LFPQ 407 

Qy 421 LLCVLFVKG-TNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPF 479 

I I I I : : I I II I : : : I : I II III I I : I : I I I : : : : I hill 
Db 408 LLCWHLKKYCNTYGSLSAYIVGFLLRALGGESILGLEPVIHYP-FFSETSG QRFPF 463 

Qy 480 KTLAMVTS FLTNI CI S YLAKYLFES GTLP PKLDVFDAWARHSEENMDKT I LVKNENI K- 538 

: I I : I : I : I : I I : I : : I I I I I II : I I : : I I : : I : : : 

Db 464 RTLSMLASLITLLAISGITKWIFEMNHLPAKLDIFRCVT — NIQEN IIKIQKLQG 516 

Qy 539 LDEL— ALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDN 578 

II: : : : : : : : I I I I : I : : I 
Db 517 GAMP VL D S I KKEI YQ KDMNN S FNTWN S GNAE L LT D S T Y S GK I KKNN 563 



RESULT 10 
O02228 

ID O02228 PRELIMINARY; PRT; 576 AA. 

AC O02228; Q9NL58; 

DT 01-JUL-1997 (TrEMBLrel. 04, Created) 

DT 01-OCT-2001 (TrEMBLrel. 18, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE C48D1.3 protein (High-affinity choline transporter CHO-1). 

GN C48D1.3 OR CHO-1. 

OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Burton J.; 

RL Submitted (OCT-1996) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=99069613; PubMed=9851916; 

RA none; 

RT "Genome sequence of the nematode C.elegans: A platform for 

RT investigating biology."; 

RL Science 282:2012-2018(1998). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=N2; 

RX MEDLINE=20116099; PubMed=10649566; 

RA Okuda T., Haga T., Kanai Y., Endou H., Ishihara T., Katsura I.; 

RT "Identification and characterization of the high-affinity choline 

RT transporter."; 

RL Nat. Neurosci. 3:120-125(2000). 

DR EMBL; Z81049; CAB02847.2; -. 

DR EMBL; AB030946; BAA90483.1; -. 

DR PIR; T20037; T20037. 

DR WormPep; C48D1.3; CE27109. 

DR GO; GO:0016020; C:membrane; IEA. 

DR GO; GO:0005215; F:transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR001734; Na/solut_symport. 

DR Pfam; PF00474; SSF; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1. 

SQ SEQUENCE 576 AA; 62427 MW; FAB09778358288D9 CRC64; 

Query Match 48.9%; Score 1453; DB 5; Length 576; 

Best Local Similarity 50.5%; Pred. No. 5.4e-96; 

Matches 295; Conservative 95; Mismatches 150; Indels 44; Gaps 9 

Qy 7 GLIAIIVFYLLILLVGIWAAWRTKNSGSAEER SEAIIVGGRDIGLLVGGFTMTATW 62 

I : : II : I I : I I I : I I I I I : : I : I I : I : : : II : I I I I I I I I I I I I 

Db 6 G I VAI VF F YVL I L WG I WAG RK SKSSKELES EAGAAT E EVMLAGRN I GT L VG I FTMT ATW 65 

Qy 63 VGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIY 122 

III I I I I I I I I : I II I I I : I I :: I I : : I I I III II : I I : I I t I I I I I 

Db 66 VGGAYINGTAEALY — NGGLLGCQAPVGYAI SLVMGGLLFAKKMREEGYITMLDPFQHKY 123 

Qy 123 GKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALIATLYTLVGGLY 182 

I : I : I I I : : : I I I : I I I I I I I I I II I I : I I I : : I I : I I : I I I I II III 
Db 124 GQRI GGLMYVPALLGET FWTAAI LSALGATLSVI LGI DMNAS VTLSACIAVFYT FTGGYY 183 

Qy 183 S VAYT DWQLF C I FVGLW I S VP FAL S H P AVAD I G FT AVHAK YQ K PWL GT VD S - S EVY S W L 241 

: I I I I I I I I i I I I I I I I I : II I : I II I I : I : I I : 

Db 184 AVAYTD VVQLFCI FVGLWVCVPAAMVHDGAKDI SRNA GDWIGEIGGFKETSLWI 237 

Qy 242 DSFLLLMLGGI P WQ AY FQ RVL S S S SAT YAQVL S FLAAFG C L VMAI P AI L I GAI GAS T D WN 301 

I I II : I I I I II I I I I II I I : I II I I I : I I I :: I I I I I I I I I : I I I 
Db 238 DCMLLLVFGGI PWQ VTFQRVLS S KTAHGAQT L S FVAGVGC I LMAI PPALI GAI ARNT DWR 297 

Qy 302 QTAYGLPDPKTTEEA DMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSA 355 

II : I I : : I :: I : I I I I I ::: I I I I I I I I I I I I I I I I I : I I I 

Db 298 MTDYSPWNNGTKVESIPPDKRNMWPLVFQYLTPRWVAFIGLGAVSAAVMSSADSSVLSA 357 



Qy 356 SSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYI 415 



: I I I I I I : : I : I : I I : I I : : I I I I : I I I I I I : : : I I I I I I : I I I I : 
Db 358 ASMFAHNIWKLTIRPHASEKEVIIVMRIAIICVGIMATIMALTIQSI YGLWYLCADLVYV 417 

Qy 416 VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGI YNQ 475 

: : I I I I I I I : : : : I I I I : : I I I I I I I : II I I : I III : I : I 

Db 418 ILFPQLLCWYMPRSNTYGSLAGYAVGLVLRLIGGEPLVSLPAFFHYPMY TDGV — Q 472 

Qy 476 KFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVV---ARHSEENMDKTILV 532 

I II : I I I : : I I : I : : I I : I I I I : I I II I I : I 

Db 473 YFPFRTTAMLSSMATIYIVSIQSEKLFKSGRLSPEWDVMGCWNIPIDHVPLPSDVSFAV 532 

Qy 533 KNENIKL DELALVKPRQSMTLSSTFTN 559 

: I : : I I I : I : I I : I 

Db 533 SSETLNMKAPNGTPAPVHPNQQPSDENTLLHPYSDQSYYSTNSN 576 



RESULT 11 
Q7UFM6 

ID Q7UFM6 PRELIMINARY; PRT; 484 AA. 

AC Q7UFM6; 

DT 01-OCT-2003 (TrEMBLrel. 25, Created) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE High affinity choline transporter. 

GN CHT1 OR RB8472. 

OS Rhodopirellula baltica. 

OC Bacteria; Planctomycetes; Planctomycetacia; Planctomycetales; 

OC Planctomycetaceae; Pirellula. 

OX NCBI_TaxID=117; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=1; 

RX MEDLINE=22735913; PubMed=12835416; 

RA Gloeckner F.O., Kube M., Bauer M., Teeling H., Lombardot T., 

RA Ludwig W., Gade D., Beck A., Borzym K., Heitmann K., Rabus R., 

RA Schlesner H., Mann R., Reinhardt R.; 

RT "Complete genome sequence of the marine planctomycete Pirellula sp. 

RT strain 1."; 

RL Proc. Natl. Acad. Sci. U.S.A. 100:8298-8303(2003). 

DR EMBL; BX294147; CAD78656.1; 

KW Complete proteome. 

SQ SEQUENCE 484 AA; 52674 MW; 79AB0135F18FEBB2 CRC64; 

Query Match 14.2%; Score 422.5; DB 16; Length 484; 

Best Local Similarity 27.6%; Pred. No. 3.9e-22; 

Matches 141; Conservative 95; Mismatches 220; Indels 55; Gaps 14; 

Qy 7 GLIAIIVFYLLI-LLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGG 65 

I I I I : : I I i : : : I : I I I I : : : I I I : I : : I I I I 

Db 2 GLIAAVLAYLLLTIAIGLLAARRVGN AQ D FMVAG RS L P L YMN FACVFATW FG - 53 

Qy 66 GYINGTAEAVY VPGYGL-AWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQ 119 

II I I I I I I I : | : | : | I I I I : : | : I : : 

Db 54 AETVLSVSATFAGQGLRAIPGDPFGFSICLVLVALFFARAFYRMDLLTIGDFYR 107 

Qy 120 QI YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI IDVDMHISVIISALIAT 173 

: I I : : I : : I I I I : I I I III: : : : : : I I 



Db 108 KRYGRS I EVLT SWI SAS YLGWAAAQLTALGLVI SVLGKGI GYETLTINNGI VI GFTI VA 167 



Qy 174 L YT L VG GL Y S VAYT DWQ L FC I FVGLW - 1 S VP FAL S H P AVAD I G FT AVHAK YQ K P W L GT V 232 

I I : : I I : : I I I I I : : I I I : I I : I I I : I : : : I : : : 
Db 168 FYTVMGGMWSVALTDMIQTFVIIIGLLWSVYMAHAAGGVSWIESARESGRLQVFPDWG 227 

Qy 233 DSSEVYSWLDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAA-FGCLVMAIPA-IL 290 

I : : : : II I I II I Mill: I : I II: : I I 

Db 228 QSGQWWIYIGGFLTAALGSIPQQDVFQRVTSAKDERTAMTGTLLGGMFYCMFAFVPMFIA 287 

Qy 291 I GAI GASTDWNQTAYGLPDPKTTEEADMI LPIVLQYLCPVYI S FFGLGAVSAAVMS SADS 350 

I : II : I II: I : : I I I : : I : : I : I 

Db 288 YAAWI DPDHLQQF NSDDLREVQRTLPHAVIQSTPFWVQTVFLGALVSAILSTASG 343 

Qy 351 SILSASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLT-KTVYGLWYLS 409 

: : I : I I : I : : I I : I I : : : I I : : I I I I I I : I : I : : 
Db 344 TLLAPSSLIVENVIR-PFRSDLDDKNMLRWLRIVLLMFGALALHQALTSNNTMYEMIQQA 402 

Qy 410 SDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDD 469 

: : I I : I I I I : I I I I : : : I : I I 
Db 403 YSVPLVGALVPLAVGL YWKRATT RGAMAS I VS GVATWLA FEYMLPEFLIPS 453 

Qy 470 NGIYNQKFPFKTLAMVTSFLTNICISYLAKY 500 

: : : III : : I I I : 
Db 454 QLMGLAASFLAMVWSLLDKF 474 



RESULT 12 
Q8EXG7 

ID Q8EXG7 PRELIMINARY; PRT; 462 AA. 

AC Q8EXG7; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Probable sodium:solute symporter. 

GN LB245. 

OS Leptospira interrogans. 

OC Bacteria; Spirochaetes; Spirochaetales; Leptospiraceae; Leptospira. 

OX NCBI_TaxID=173; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=56601 / Serogroup Icterohaemorrhagiae / Serovar lai; 

RA Ren S.; 

RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AE011612; AAN51804.1; -. 

DR GO; GO:0016020; C:membrane; IEA. 

DR GO; GO:0005215; F:transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR001734; Na/solut_symport. 

DR Pfam; PF00474; SSF; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1. 

KW Complete proteome. 

SQ SEQUENCE 462 AA; 50487 MW; C9B0104065514C68 CRC64; 

Query Match 13.6%; Score 405.5; DB 16; Length 462; 

Best Local Similarity 27.5%; Pred. No. 6.2e-21; 

Matches 134; Conservative 103; Mismatches 204; Indels 47; Gaps 16; 



Qy 8 LIAI-IVFYLL-ILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGG 65 

: : I I : : I I I : I : I I : : I : : : I I : I : : I I I I 

Db 1 MLAI S VI FYL FTT I L I GAVAS RFVS D SKDYVLAGRRLPLFLASSALFATWFGS 53 

Qy 66 GYINGTAEAVYVPGYGLAWAQAP I GYS LS LI LGGLFFAKPMRS KGYVTMLDP FQQI YGKR 125 

: I I : : I I : I I : I I I I I I I I : I : : I I : : : I : I 

Db 54 ETLLG-ASSRFVEDGILGVIEDPFGAALCLFLVGLFFARPLYRMNILTFGDFYKNRFGRR 112 

Qy 126 MGGLLFI PALMGEMFWAAAI FSALGATI SVI IDVDMHI SVI I SALIATLYTLVGGLY 182 

: : II: I I I I I I I I : I : : : I I : : I I : I I : : 

Db 113 AEILSSVFMIPSYFG WIAAQFVALGI I FHSLADI PVSTGI IAGAGWLI YTVTGGMW 169 

Qy 183 S VAYT DWQL FC I FVGLWI S VP FAL S H P AVAD I GFTAVHAK YQ KP WLGTVDSSEVY 238 

: : : I I : I I : I I I : I I I I : I II : : : : : : 

Db 170 AISLTDFLQTVLIVLGLSYLV-WDLSSKAG GIEKILAS-TKPGFFRFFPEMNAKSI F 224 

Qy 239 SWLDSFLL LML GG I P WQ A Y FQ RVL S S S SAT YAQ VL S FLAAFGC LVMA- 1 P AI L I GAI GAS 297 

::::::: I I I I I I I I I : : I I I I I :| I : I : I II : I 
Db 225 AYI AAWMTI GLGS I PQQDI FQRVMAS KS EKVAVYS S LLGS FFYLS VAFLP — L I AVLCAR 2 82 

Qy 298 TDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASS 357 

: : I I : I I I I I : : : I I : : I I I I : I : I I : : : I 

Db 283 KIYPEIA KEDAQMILPKTVLTHTGLFTQILFFGALLSAVMSTASGAI LAS AS 334 

Qy 358 MFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVI 417 

: |: : |:: I:: :: : |::| : :|| : I I :| I : : 

Db 335 VLGENVIRPFFKK-TSERTLLRLFRLSVIAITLVSLSMANTKSNIYELVSQASALSLVSL 393 

Qy 418 FPQLLCVLFVKGTNTYGAVAGYVSG LFLRITGGEPYLYLQPLIFYPGYYPD 468 

I I : I I I : : I I : : I III II : I I : : 

Db 394 FI PLVAGLFRKNSTSTGAI FSMI VGFCTWFLCNILSLEI PASI PGLI S SWI ALYLGDWME 453 

Qy 469 DNGIYNQK 476 

I I I I 

Db 454 HRG-YIQK 460 



RESULT 13 
Q8Y273 



ID Q8Y273 PRELIMINARY; PRT; 479 AA. 

AC Q8Y273; 

DT 01-MAR-2002 (TrEMBLrel. 20, Created) 

DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Probable sodium/solute symporter transmembrane protein. 

GN RSC0463 OR RS04434. 

OS Ralstonia solanacearum (Pseudomonas solanacearum). 

OC Bacteria; Proteobacteria; Betaproteobacteria; Burkholderiales; 

OC Burkholderiaceae; Ralstonia. 

OX NCBI_TaxID=305; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=GMI1000; 

RX MEDLINE=21681879; PubMed=11823852; 

RA Salanoubat M., Genin S., Artiguenave F., Gouzy J., Mangenot S., 

RA Arlat M., Billault A., Brottier P., Camus J.C., Cattolico L., 

RA Chandler M., Choisne N., Claudel-Renard C., Cunnac S., Demange N., 

RA Gaspin C., Lavie M., Moisan A., Robert C., Saurin W., Schiex T., 

RA Siguier P., Thebault P., Whalen M., Wincker P., Levy M., 

RA Weissenbach J., Boucher C.A.; 

RT "Genome sequence of the plant pathogen Ralstonia solanacearum."; 

RL Nature 415:497-502(2002). 

DR EMBL; AL646059; CAD13991.1; 

DR GO; GO:0016020; C:membrane; IEA. 

DR GO; GO:0005215; F:transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR001734; Na/solut_symport. 

DR Pfam; PF00474; SSF; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1. 

KW Complete proteome. 

SQ SEQUENCE 479 AA; 52091 MW; 560962E411DBC9B8 CRC64; 



Query Match 12.8%; Score 381.5; DB 16; Length 479; 

Best Local Similarity 28.2%; Pred. No. 3.4e-19; 

Matches 128; Conservative 85; Mismatches 188; Indels 53; Gaps 16 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

: I I :::::: I : I I I I : I : lit: I I : I I I I : I 

Db 6 VI VYWVI S VGI GLWAAIjRVRNTAD FAVAGRGLPFYWTATVFATWFGSETVLG 58 

Qy 71 TAEAVYVPGYGLAWAQA-PIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIYGKRMGGL 129 

II:: II I I I I I I I I I I I I I : I : : I : I : : : I : I 

Db 59 -IPAVFLK-EGLHGWADPFGSSLCLILVGLFFARPLYRMNLLTIGDFYRNRFGRVAEVL 116 

Qy 130 L FI PALMGEMFWAAAI FS ALGAT I S VI I D — VDMHISVII SALIATLYTLVGGLYSVAYT 187 

: : : : I I I III : I : : : I I : I I I I I : : I I I I 

Db 117 TTLCIWSYLGWVAAQIKALGLVFYTVSDGGLSQQTGMMIGAASVLVYTLFGGMWSVAVT 176 

Qy 188 DWQ L FC I FVG L W I S VP FAL S H P AVAD I G FT AVHA KYQKPWLGTVDSSEVYSWLDS 243 

I : I : I : I : : : : : I I : II |: : : II :: : 

Db 177 DFIQMI 1 1 VI GM-MYI GWEVS GQA-GGVATWAHASAAGKFS — FWPAFNP I EVI GFVTA 232 

Qy 244 FL LLMLGG I PWQAYFQRVL SS S S AT YAQVLS FLAAFGCLVMAI PAI LI GAI GA 296 

: : : I I I I I I I I I I I I : : : I I II II : : I III 
Db 233 WITMMLGSIPQQDVFQRVTSSRTERIAGTASVLGGVLYFLFAFIPMFLAYSATLI 287 

Qy 297 STDWNQTAYGLPDPK TTEEADMILP-IVLQYLCPVYISFFGLGAVSAAVMSSADS 350 

II: : : : | | | : M : : | : . I I : : I : I I : 

Db 288 DPQMVARYINTDSQLILPKLVLEH-APLVAQVMFFGALLSAIKSCASA 334 

Qy 351 SILSASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTK-TVYGLWYLS 409 

: : I : I III:: II : I I : I I I I I I I : : : : : : 

Db 335 T LLAP S VT FAEN VL R~ PML P RMDDKRFL RVMQAWLVFT ALVT L FALN S H L S I FHMVENA 393 

Qy 410 SDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSGL 443 

: : I I III I : II 

Db 394 YKVTLVAAFVPLAFGLFWKRATRQGGLLAIALGL 427 



RESULT 14 
Q9V2P3 

ID Q9V2P3 PRELIMINARY; PRT; 492 AA. 

AC Q9V2P3; 



DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Proline symporter (Proline permease). 

GN PUTP-3 OR PYRAB00320 OR PAB2354. 

OS Pyrococcus abyssi. 

OC Archaea; Euryarchaeota; Thermococci; Thermococcales; Thermococcaceae; 

OC Pyrococcus. 

OX NCBI_TaxID=29292; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=GE5 / Orsay; 

RA Heilig R.; 

RT "Pyrococcus abyssi genome sequence: insights into archaeal chromosome 

RT structure and evolution."; 

RL Submitted (JUL-1999) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AJ248283; CAB48955.1; 

DR PIR; D75188; D75188. 

DR GO; GO:0016020; C:membrane; IEA. 

DR GO; GO:0005215; F:transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR001734; Na/solut_symport. 

DR Pfam; PF00474; SSF; 1. 

DR TIGRFAMs; TIGR00813; sss; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1. 

KW Complete proteome. 

SQ SEQUENCE 492 AA; 53457 MW; A7C72B1AF29282B3 CRC64; 

Query Match 11.6%; Score 344; DB 17; Length 492; 

Best Local Similarity 24.2%; Pred. No. 1.7e-16; 

Matches 132; Conservative 99; Mismatches 196; Indels 118; Gaps 25 

Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

I : I : : I : I I I : I III: I MM: : : : : 

Db 14 LVAFLFTLILPILVGFYAMKRTKS EEDFFVGGRAMDKITVALSAVSSGRSSWL 66 

Qy 68 I N GT AEAVYVP G YGLAWAQAP I G Y S L S LI LGGLFFAKPMRS KGYVTMLDPFQQI YG 123 

: I : II | | : | | : : : | : | : | : | | : : 

Db 67 VLGLSGMAYKMGVTAVW — AAVGYI VAEMFQFVYMGI RLRKFS ERFNAITVPDYFEARFR 124 

Qy 124 K RMGG LLFI PALMGEMFWAAAI FSALGATI S VI I DVDMHI S VI I SALI ATL 174 

I: ::|: : :| III |:| : : : :::|| |: : 

Db 125 DTSKILRIAASIIIIIFLTSYVGAQFNAGA KTLSTALGI SI FTALMI SVLMI IV 178 

Qy 175 YTLVGGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFT AVHAKYQK 226 

I : : I I : I I I I I : : : : I I : j I I I : I I I : I 

Db 179 YMILGGFIAVAYNDVIRAVIMIIGLW LPVIAVAKVGGTEEVLKVLHALDPKLIN 233 

Qy 227 PW LGTVDS S EVYSWLDS FLLLMLG- GI PWQAY- FQRVLS S S SAT YAQVLS FLAAFGC 281 

II II I : I I I I : I : I : I : : I 

Db 234 PWAFGAGWIG FLGIGFGSPGQPHI IVRYMSIDDPNKLRVSTWGTFWN 282 

Qy 282 LVMAIPAILIGAIGASTDWNQTAYGLPDPKTT--EEADMILP-IVLQYLCPVYISFFGLG 338 

: I : I I I : I I : : I I : I : I I I : I I I : : I 

Db 2 83 WLAWGAI FVGLAGRAI VPDVSQLPGKNAEMI YPYLSAQYFPPILYGIL-IG 333 



Qy 339 AVSAAVMSSADSSILSASSMFARNIYQLSFRQNA--SDKEIVWVMRITVFVFGASATAMA 396 



: M::|:||| :| :| :: : I : : I : I I I I I : I 

Db 334 GIFAAILSTADSQLLWASTWKDLYQEVIKKGTKIDEKTALTISRVTVLWGFLAAILA 393 

Qy 397 LLTKTVYGLWYLSSDLVY-IVIF PQLLCVLFVKGTNTYGAVAGYVSGLFL 445 

I : : I : : : I : I I I : hill : I : I I : I 

Db 394 YVAKDIIFWFVLFAWGGLGASFGPTLILSLYWKGTTKWGVLAGMIVGTIT 443 

Qy 446 RITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESG 505 

I ll|:|: hi : I I : I : I :| : I 
Db 444 TIVW KLYLKPI TGLY-ELVP AFI FSLIATI IVSMITK 47 9 

Qy 506 TLPPK 510 

I I : 

Db 480 --PPE 482 



RESULT 15 
Q81Y52 

ID Q81Y52 PRELIMINARY; PRT; 492 AA. 

AC Q81Y52; 

DT 01-JUN-2003 (TrEMBLrel. 24, Created) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Sodium/proline symporter family protein. 

GN BA3705. 

OS Bacillus anthracis (strain Ames). 

OC Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus. 

OX NCBI_TaxID=198094; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=22608414; PubMed=12721629; 

RA Read T.D., Peterson S.N., Tourasse N., Baillie L.W., Paulsen I.T., 

RA Nelson K.E., Tettelin H., Fouts D.E., Eisen J.A., Gill S.R., 

RA Holtzapple E.K., Okstad O.A., Helgason E., Rilstone J., Wu M., 

RA Kolonay J.F., Beanan M.J., Dodson R.J., Brinkac L.M., Gwinn M., 

RA DeBoy R.T., Madpu R., Daugherty S.C., Durkin A.S., Haft D.H., 

RA Nelson W.C., Peterson J.D., Pop M., Khouri H.M., Radune D., 

RA Benton J.L., Mahamoud Y., Jiang L., Hance I.R., Weidman J.F., 

RA Berry K.J., Plaut R.D., Wolf A.M., Watkins K.L., Nierman W.C., 

RA Hazen A., Cline R., Redmond C., Thwaite J.E., White O., Salzberg S.L., 

RA Thomason B., Friedlander A.M., Koehler T.M., Hanna P.C., Kolsto A.-B., 

RA Fraser C.M.; 

RT "The genome sequence of Bacillus anthracis Ames and comparison to 

RT closely related bacteria."; 

RL Nature 423:81-86(2003). 

DR EMBL; AE017035; AAP27454.1; -. 

DR TIGR; BA3705; 

DR GO; GO:0016020; C:membrane; IEA. 

DR GO; GO:0005215; F:transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR001734; Na/solut_symport. 

DR Pfam; PF00474; SSF; 1. 

DR TIGRFAMs; TIGR00813; sss; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1. 

KW Complete proteome. 

SQ SEQUENCE 492 AA; 53891 MW; E2377D735C1A90F9 CRC64; 



Query Match 11.2%; Score 334; DB 16; Length 492; 

Best Local Similarity 22.9%; Pred. No. 9e-16; 

Matches 129; Conservative 101; Mismatches 219; Indels 114; Gaps 17; 

Qy 5 VEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVG 64 

:| :::: :: :| :| |: :| : ::||| :| | : |: : 

Db 3 IEIMVS LAI YMAGMLYIGYWSYKKTSDLSD YMLGGRGLGPAVTAL SAGAS DMS 55 

Qy 65 GGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGG LFFAKPMR SKGYVTML 115 

I : I |:| I I :: |::| I I :| : :|: 

Db 56 GWMLMGLPGAMYATGLSSVW IAIGLLIGAYANYLILAPRLRTYTEVANDSITI P 109 

Qy 116 DPFQQIYGKRMGGLLFIPA LMGEMFWAAAI FS ALGAT I S VI I DVDMH I S VI I SAL I A 172 

I : : I I I : I I : I : I : I : | : : | I : : : : 

Db 110 DFLENRFKDRTKILRFVSAIVILVFFTFYASAGLVSGGRLFENSFNLDYKIGLFVTVGW 169 

Qy 173 TLYTLVGGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIG FTAVHAKYQKP 227 

I I I I I : I : : I I I I : I : I : I I I I : I I : 
Db 170 VAYTLFGGFLAVSWTDFVQGCIMFIAL-VLVPIV AFTDVGGVTETFNTIK 218 

Qy 228 WLGTVDSSEVYSWLDSFLLLMLGGIPW-QAYF QRVL S S S S AT YAQVL S FLAAFG 280 

I I : I : : : : I : : : t II I : : : I : : 

Db 219 QVDASHLDMFKGTTILGIISFLAWGLGYFGQPHIIVRFMAITSIKDLKTSRRIGIGW 275 

Qy 281 CLVMAI PAILIGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAV 340 

: I I : I I : I II : I : : : I : I I I : I I I : 

Db 276 MT I S 1 1 GAMLT GLVG IAYYAKNNATLQDPEMVFVTFSNILFHPYITGFLLSAI 328 

Qy 341 SAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTK 400 

I : : I I I I : I II : I : II : I I I I I : I : : I : : I I I : I 
Db 329 LAS IMS S I S SQLLVI S SAVTEDFYKT FFRRKASDKELVFI GRLSVLWAMI AWLA 384 

Qy 4 01 TVYGLWYLSSDLVYIVI FPQLLCVLFVKGTNTYGAVAGYVSGLFLRITG 44 9 

I II: : : I - I I : I M : I : I I : I : I I 

Db 385 YHPSDTILTLVGYAWAGFGSAFGPAILLSLYWKRTNKWGVLAGMIVGALWITW 438 

Qy 450 GE-PYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLP 508 

: I I II:: I I I : I : I 
Db 439 VQ I P S L KAS MY EMVP G F F CSLLAVIIVSLVTK 47 0 

Qy 509 P K L DVF D AWARH S E ENMD KT I L 531 

: I I I I I : : I 
Db 471 — E P VKAI H RE FN EMEAVL 4 87 
SUMMARIES



Result         Query
   No.  Score  Match  Length  DB  ID          Description
     1  308.5   10.4     662   1  SL51_RABIT  P11170 oryctolagus
     2  308     10.4     670   1  SL52_RAT    P53792 rattus norv
     3  306     10.3     664   1  SL51_HUMAN  P13866 homo sapien
     4  306     10.3     665   1

RESULT 1
SL51_RABIT 

ID SL51_RABIT STANDARD; PRT; 662 AA. 

AC P11170; 

DT 01-JUL-1989 (Rel. 11, Created)

DT 01-JUL-1989 (Rel. 11, Last sequence update) 

DT 15-JUL-1998 (Rel. 36, Last annotation update) 

DE Sodium/glucose cotransporter 1 (Na(+)/glucose cotransporter 1)

DE (High affinity sodium-glucose cotransporter).

GN SLC5A1 OR SGLT1.

OS Oryctolagus cuniculus (Rabbit).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus.

OX NCBI_TaxID=9986;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=New Zealand white; 

RX MEDLINE=88065856; PubMed=2446136;

RA Hediger M.A., Coady M.J., Ikeda T.S., Wright E.M.;

RT "Expression cloning and cDNA sequencing of the Na+/glucose co-

RT transporter.";

RL Nature 330:379-381(1987).

RN [2] 

RP SEQUENCE FROM N.A. 



RC STRAIN=New Zealand white; TISSUE=Kidney cortex;

RX MEDLINE=91223090; PubMed=2025641;

RA Morrison A.I., Panayotova-Heiermann M., Feigl G., Schoelermann B.,

RA Kinne R.K.H.;

RT "Sequence comparison of the sodium-D-glucose cotransport systems in

RT rabbit renal and intestinal epithelia.";

RL Biochim. Biophys. Acta 1089:121-123(1991).

CC -!- FUNCTION: Actively transports glucose into cells by Na(+) co-

CC transport with a Na(+) to glucose coupling ratio of 2:1.

CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is

CC provided by the concerted action of a low affinity high capacity

CC and a high affinity low capacity Na(+)/glucose cotransporter

CC arranged in series along kidney proximal tubules.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: Found predominantly in intestine and in outer 
CC renal medulla. 

CC -!- DISEASE: Mutation of Asp-28 is implicated in glucose/galactose 
CC malabsorption. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X06419; CAA29727.1; -. 

DR EMBL; X55355; CAA39040.1; -. 

DR PIR; S00515; A37226. 

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport;
(BY SIMILARITY).

SQ SEQUENCE 662 AA; 73079 MW; 03F55A0309CBBE01 CRC64;



Query Match 10.4%; Score 308.5; DB 1; Length 662; 

Best Local Similarity 23.4%; Pred. No. 2.6e-14; 

Matches 154; Conservative 110; Mismatches 238; Indels 155; Gaps 26; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 7 0 

| : : : : I : : : I I : I I : I I I : : I I : | : : | : : I I : I 

Db 32 IVIYFLWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 8 6 

Qy 71 TAEAVYVPGYGLAWAQAPIGYS L S LI LGGLFFAKPMRS KG YVTMLDP FQQ I Y- GK 124 

I til I I : : : : I I : I : I : I I I I : I : : I I 

Db 87 LA GTGAASGIATGGFEWNALIMVWLGWVFVPIYIRA-GVVTMPEYLQKRFGGK 139 

Qy 125 RMGGLLFI PALMGEMFW- -AAAI FSALGAT- 1 SVI I DVDMHI SVI I SALI ATLYTLVGGL 181 

I : I I : I : : I : I I I I II I : : : I : : : : : I.I : I I II : I I I 
Db 140 RIQIYLSILSLLLYIFTKISADIFS — GAIFIQLTLGLDIYVAIIILLVITGLYTITGGL 197 

Qy 182 Y S VAYT DVVQ L FC I FVGLWI S VP FAL S H P AVAD I G FT AVHAK Y Q 225 

: I I I I : I : I I I II hi li I 

Db 198 AAVIYTDTLQTAIMMVGSVILTGFAFHEVG GYEAFTEKYMRAIPSQISYGNTSIPQ 253 

Qy 226 KPWLGTVDS SEVYSWLDS FLLLMLGGI PW QAY FQRVL S S S S A 2 67 

I : I : I : I I I I I I I I h : 

Db 254 KCYTPREDAFHI FRDAITGDIPWPGLVFGMSILTLWYWCTDQVIVQRCLSAKNL 307 

Qy 268 TYAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPDP KTTEEADMILP 321 

: : : | : : : : : : I : : : I : I : : I 

Db 308 SHVKAGCILCGYLKVT^PMFLIViyiMGMVSRILYTDKVACWPSECERYCGTRVGCTNIAFP 367 



Qy 322 IVLQYLCPVYI S FFGLGAVSAAVMSSADSS I LSAS SMFARNI YQLSFRQNASDKEI WVM 381 

: : || : I : I : : I I I I I I h : I : I I I : I h I h : 

Db 368 TLVVELMPNGLRGLMLSV>1MASLMSSLTSIFNSASTLFTMDI Y-TKIRKKASEKELMIAG 426 

Qy 382 RI -TVFVFGAS ATAMALLTKTVYG — LWYLS S DLVYI — VI FPQLLCVLFVKGTNTYGAV 436 

| : : I : I I : : : I I : I I : I I : I I I I I 

Db 427 RLMLFLIGISIAWPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLI^IFWKRVNEPGAF 486 

Qy 4 37 AGYVSGLFLRI --TG GEPYLYLQPLIFYPGYYPDDNGI Y 473 

I I I : I II I I I I : : I 
Db 487 WGLVLGFLIGISRMITEFAYGTGSCMEPSNCPTIICGVHYLYFAIILF 534 



Qy 474 NQKFPFKTLAMWSFLTNICISYLAKYLFESGTLPPKLDVFDAVVA-RHSEENMDKTILV 532 

I I : I : : I | : I : : : : h h I 
Db 535 VISIITWWSLFTKPI PDVHLYRLCWSLRNSKE 568 



Qy 533 KNENI KLD — ELALVKPRQSMTLS STFTNKEAF LDVDSSPEGSGTED 577



I I || I : : : I : hi || | |: : |: 

Db 569 --ERIDLDAGEEDIQEAPEEATDTEVPKKKKGFFRRAYDLFCGLDQDKGPKMTKEEE 623 

RESULT 2 
SL52_RAT 

ID SL52_RAT STANDARD; PRT; 670 AA.

AC P53792; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Sodium/glucose cotransporter 2 (Na(+)/glucose cotransporter 2)

DE (Low affinity sodium-glucose cotransporter).

GN SLC5A2 OR SGLT2.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; TISSUE=Kidney; 

RX MEDLINE=96094332; PubMed=7493971;

RA You G., Lee W.-S., Barros E.J.G., Kanai Y., Huo T.-L., Khawaja S.,

RA Wells R.G., Nigam S.K., Hediger M.A.;

RT "Molecular characteristics of Na(+)-coupled glucose transporters in

RT adult and embryonic rat kidney.";

RL J. Biol. Chem. 270:29365-29371(1995). 

CC -!- FUNCTION: Sodium-dependent glucose transporter. Has a Na+ to 
CC glucose coupling ratio of 1:1. 

CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is 

CC provided by the concerted action of a low affinity high capacity 

CC and a high affinity low capacity Na(+)/glucose cotransporter

CC arranged in series along kidney proximal tubules. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: Kidney, in proximal tubule S1 segments.

CC -!- DEVELOPMENTAL STAGE: Appears on embryonic day 17 and gradually 

CC increases until day 19. Decreases between day 19 and birth. 

CC -!- PTM: GLYCOSYLATED AT A SINGLE SITE. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; U29881; AAC52325.1; 

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport; 

KW Glycoprotein.
N-LINKED (GLCNAC...) (PROBABLE)

SQ SEQUENCE 670 AA; 72961 MW; 0609562861618BB3 CRC64;



Query Match 10.4%; Score 308; DB 1; Length 670; 

Best Local Similarity 23.3%; Pred. No. 2.9e-14; 

Matches 148; Conservative 95; Mismatches 209; Indels 184; Gaps 28; 



Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

: : I : : | | : : | I : I : : I I I I : : I I : | : : I : : I I : 

Db 2 4 I LVI AAYFLLVI GVGLWSMFRT-NRGTV GGYFLAGRSMVWWPVGAS LFASNI GS GH 7 8 

Qy 68 INGTAEAVYVPGYGLAWAQAPIGYS LSLILGGLFFAKPMRSKGYVTMLDPFQQIY 122 

|| Ml II: : I : I I I I : : I : I I I 
Db 79 FVGLA GT GAAS GLAVAGFEWNAL FVVL LLGWL FVP VYL - T AGVI TM PQYL 127 

Qy 123 GKRMGG LLFI PALMGEMFWAAAI F — SALGAT I SVI I DVDMHI S VI I S 168 

| | | | I : I : : : I : I I I I I : I I I 

Db 128 RKRFGGRRIRLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNI YASVIAL 179 

Qy 169 AL I AT L YT L VGGL Y S VAYT DWQ L FC I FVG LW I S VP FAL S H P AVAD I G FT AVHAK Y 224 

I : I I : I I I : : I I I I I I I I : I : I I : : : I I 

Db 180 L G I TMI YT VT GGLAALMYT DT VQT FVI LAGAF I LT G YAFH E VG GYSGLFDKYLGAV 235 

Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAY 257 

: I : I : I : I I : I I : I I : I I I 

Db 236 TSLTVSKDPAVGNISSTCYQPRPDSYHLLRDPVTGGLPWPALLLGLTIVSGWHWCSDQVI 295 



Qy 258 FQRVLSSSSATYAQ VL S FLAAFGCLVMAI P AI L I GAI GAS T DWN Q T AY GL P D P KT T 313 

| | | : : I : : : : I : | : | : : : : I I I 

Db 296 VQRCLAGKNLTHIKAGCILCGYLKLMPMFLMVMPGMI SRILY--PD 339 

Qy 314 EEADMILPIVLQYLC PVYISFFGLGAVSAAVMSSADSSILS 354 

| :: | I : : I I : I : I I : I I I I I 

Db 340 -EVACVVPEVCKRVCGTEVGCSNIAYPRLVVKLMPNGLRGLMLAVi^LAAi 398 



Qy 355 AS SMFARN I YQL S FRQNAS DKEI VWVMRI TVFVFGAS ATAMALLTKT VYG LWYLSSD 411 

: I : : I : I I I I I : I : : I I : I I : I : : I I : I 

Db 399 SSTLFTMDIY-TRLRPRAGDRELLLVGRLWWFIVAVSVAWLPWQAAQGGQLFDYIQSV 457 

Qy 412 LVYIV — IFPQLLCVLFVKGTNTYGAVAGYVSGLFLRI TGG--EP 452 

I : : : I I I I II I : I | : : II I 

Db 458 SSYIAPPVSAVFVLALFVPRVNEKGAFWGLIGGLLMGLARLIPEFFFGTGSCVRPSACPA 517 

Qy 453 YLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGT 506 

I I I : : I : MM:: | : I : : I 
Db 518 IFCRVHYLYFAIILFFCS GFLTLA- 1 S RCTAP I PQKHLHRLVFS 560 

Qy 507 LP PKLDVFDAWARHS EENMDKT I LVKNENI KLDEL 542 

I I I : I : I : : : I I 

Db 561 LRHSKE EREDLDAEEL 57 6 



RESULT 3 
SL51_HUMAN 

ID SL51_HUMAN STANDARD; PRT; 664 AA. 

AC P13866; 

DT 01-JAN-1990 (Rel. 13, Created) 

DT 01-JAN-1990 (Rel. 13, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Sodium/glucose cotransporter 1 (Na(+)/glucose cotransporter 1)

DE (High affinity sodium-glucose cotransporter).

GN SLC5A1 OR SGLT1.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=89345544; PubMed=2490366;

RA Hediger M.A., Turk E., Wright E.M.;

RT "Homology of the human intestinal Na+/glucose and Escherichia coli

RT Na+/proline cotransporters.";

RL Proc. Natl. Acad. Sci. U.S.A. 86:5748-5752(1989). 

RN [2] 

RP SEQUENCE FROM N.A. 

RA Swan M.;

RL Submitted (JUN-1998) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20057165; PubMed=10591208;

RA Dunham I., Hunt A.R., Collins J.E., Bruskiewich R., Beare D.M.,

RA Clamp M., Smink L.J., Ainscough R., Almeida J.P., Babbage A.K.,

RA Bagguley C., Bailey J., Barlow K.F., Bates K.N., Beasley O.P.,

RA Bird C.P., Blakey S.E., Bridgeman A.M., Buck D., Burgess J.,

RA Burrill W.D., Burton J., Carder C., Carter N.P., Chen Y., Clark G.,

RA Clegg S.M., Cobley V.E., Cole C.G., Collier R.E., Connor R.,

RA Conroy D., Corby N.R., Coville G.J., Cox A.V., Davis J., Dawson E.,

RA Dhami P.D., Dockree C., Dodsworth S.J., Durbin R.M., Ellington A.G.,

RA Evans K.L., Fey J.M., Fleming K., French L., Garner A.A.,

RA Gilbert J.G.R., Goward M.E., Grafham D.V., Griffiths M.N.D., Hall C.,

RA Hall R.E., Hall-Tamlyn G., Heathcott R.W., Ho S., Holmes S.,

RA Hunt S.E., Jones M.C., Kershaw J., Kimberley A.M., King A.,

RA Laird G.K., Langford C.F., Leversha M.A., Lloyd C., Lloyd D.M.,

RA Martyn I.D., Mashreghi-Mohammadi M., Matthews L.H., Mccann O.T.,

RA Mcclay J., Mclaren S., McMurray A.A., Milne S.A., Mortimore B.J.,

RA Odell C.N., Pavitt R., Pearce A.V., Pearson D., Phillimore B.J.C.T.,

RA Phillips S.H., Plumb R.W., Ramsay H., Ramsey Y., Rogers L., Ross M.T.,

RA Scott C.E., Sehra H.K., Skuce C.D., Smalley S., Smith M.L.,

RA Soderlund C., Spragon L., Steward C.A., Sulston J.E., Swann R.M.,

RA Vaudin M., Wall M., Wallis J.M., Whiteley M.N., Willey D.L.,

RA Williams L., Williams S.A., Williamson H., Wilmer T.E., Wilming L.,

RA Wright C.L., Hubbard T., Bentley D.R., Beck S., Rogers J., Shimizu N.,

RA Minoshima S., Kawasaki K., Sasaki T., Asakawa S., Kudoh J.,

RA Shintani A., Shibuya K., Yoshizaki Y., Aoki N., Mitsuyama S.,

RA Roe B.A., Chen F., Chu L., Crabtree J., Deschamps S., Do A., Do T.,

RA Dorman A., Fang F., Fu Y., Hu P., Hua A., Kenton S., Lai H., Lao H.I.,

RA Lewis J., Lewis S., Lin S.-P., Loh P., Malaj E., Nguyen T., Pan H.,

RA Phan S., Qi S., Qian Y., Ray L., Ren Q., Shaull S., Sloan D., Song L.,

RA Wang Q., Wang Y., Wang Z., White J., Willingham D., Wu H., Yao Z.,

RA Zhan M., Zhang G., Chissoe S., Murray J., Miller N., Minx P.,

RA Fulton R., Johnson D., Bemis G., Bentley D., Bradshaw H., Bourne S.,

RA Cordes M., Du Z., Fulton L., Goela D., Graves T., Hawkins J.,

RA Hinds K., Kemp K., Latreille P., Layman D., Ozersky P., Rohlfing T.,

RA Scheet P., Walker C., Wamsley A., Wohldmann P., Pepin K., Nelson J.,

RA Korf I., Bedell J.A., Hillier L.W., Mardis E., Waterston R.,

RA Wilson R., Emanuel B.S., Shaikh T., Kurahashi H., Saitta S.,

RA Budarf M.L., McDermid H.E., Johnson A., Wong A.C.C., Morrow B.E.,

RA Edelmann L., Kim U.J., Shizuya H., Simon M.I., Dumanski J.P.,

RA Peyrard M., Kedra D., Seroussi E., Fransson I., Tapia I., Bruder C.E.,

RA O'Brien K.P., Wilkinson P., Bodenteich A., Hartman K., Hu X.,

RA Khan A.S., Lane L., Tilahun Y., Wright H.;

RT "The DNA sequence of human chromosome 22."; 

RL Nature 402:489-495(1999). 

RN [4] 

RP VARIANT GGM ASN-28. 

RX MEDLINE=91179516; PubMed=2008213;

RA Turk E., Zabel B., Mundlos S., Dyer J., Wright E.M.;

RT "Glucose/galactose malabsorption caused by a defect in the

RT Na+/glucose cotransporter.";

RL Nature 350:354-356(1991). 

RN [5] 

RP VARIANT GGM GLY-28.

RX MEDLINE=94253082; PubMed=8195156;

RA Turk E., Martin M.G., Wright E.M.;

RT "Structure of the human Na+/glucose cotransporter gene SGLT1.";

RL J. Biol. Chem. 269:15204-15209(1994). 

CC -!- FUNCTION: Actively transports glucose into cells by Na(+) 

CC co-transport with a Na(+) to glucose coupling ratio of 2:1. 

CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is 

CC provided by the concerted action of a low affinity high capacity 

CC and a high affinity low capacity Na(+)/glucose cotransporter

CC arranged in series along kidney proximal tubules. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: Expressed mainly in intestine and kidney. 

CC -!- DISEASE: Defects in SLC5A1 are the cause of congenital glucose- 

CC galactose malabsorption (GGM) [MIM:606824]. GGM is an intestinal

CC monosaccharide transporter deficiency. It is an autosomal 

CC recessive disorder manifesting itself within the first weeks of 

CC life. It is characterized by severe diarrhea and dehydration which
CC are usually fatal unless glucose and galactose are eliminated from
CC the diet.

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).



DR EMBL; L29339; AAB59448.1; -.
DR EMBL; L29328; AAB59448.1; JOINED.
DR EMBL; L29330; AAB59448.1; JOINED.
DR EMBL; L29329; AAB59448.1; JOINED.
DR EMBL; L29331; AAB59448.1; JOINED.
DR EMBL; L29332; AAB59448.1; JOINED.
DR EMBL; L29333; AAB59448.1; JOINED.
DR EMBL; L29334; AAB59448.1; JOINED.
DR EMBL; L29335; AAB59448.1; JOINED.
DR EMBL; L29336; AAB59448.1; JOINED.
DR EMBL; L29337; AAB59448.1; JOINED.
DR EMBL; L29338; AAB59448.1; JOINED.
DR EMBL; M24847; AAA60320.1; -.
DR EMBL; AL022321; -; NOT_ANNOTATED_CDS.
DR EMBL; Z83849; -; NOT_ANNOTATED_CDS.
DR EMBL; Z74021; -; NOT_ANNOTATED_CDS.
DR EMBL; Z80998; -; NOT_ANNOTATED_CDS.
DR EMBL; Z83839; -; NOT_ANNOTATED_CDS.
DR PIR; A33545; A33545.
DR Genew; HGNC:11036; SLC5A1.
DR MIM; 182380; -.
DR MIM; 606824; -.

DR GO; GO:0005887; C:integral to plasma membrane; TAS.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport;
KW Glycoprotein; Disease mutation.
CYTOPLASMIC (POTENTIAL).

FT TRANSMEM POTENTIAL.



FT DOMAIN 335 423 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM 424 443 POTENTIAL.

FT DOMAIN 444 455 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM 456 476 POTENTIAL.

FT DOMAIN 477 526 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM 527 547 POTENTIAL.

FT DOMAIN 548 642 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM 643 663 POTENTIAL.

FT CARBOHYD 248 248 N-LINKED (GLCNAC...) (BY SIMILARITY).

FT VARIANT 28 28 D -> G (IN GGM).

FT /FTId=VAR_013630.

FT VARIANT 28 28 D -> N (IN GGM).

FT /FTId=VAR_007168.

SQ SEQUENCE 664 AA; 73497 MW; 2B403376595EAB74 CRC64; 

Query Match 10.3%; Score 306; DB 1; Length 664; 

Best Local Similarity 22.8%; Pred. No. 4e-14; 

Matches 148; Conservative 104; Mismatches 218; Indels 178; Gaps 30; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 7 0 

I :::::::: I I : I I : I I I : : I I : I : : I : : I I : I 

Db 32 IVIYFVWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 8 6 

Qy 71 TAEAVYVPGYGLAWAQAP I GYS LSLILGGLFFAKPMRSK-GYVTMLDPFQQI YGK 124 

I III II: I : : I I I I I : I I I II : I 

Db 87 LA GTGAASGIAIGGFEWNALVLVWLGWLFV — PIYIKAGWTM PEYLRK 134 

Qy 125 RMGG LL FI PALMGEMFWAAAI FSALGAT I SVI I DVDMHI S VI I SAL I A 172 

III I I : I : : : I I I | :::::::::: : I 

Db 135 RFGGQRIQVYLSLLSLLLYIFTKISADIFSGAIF INLALGLNLYLAI FLLLAIT 188 

Qy 173 T L YT LVGGL Y S VAYT DWQ L F C I FVGLW I S VP FAL S H P AVAD I GFT AVHAK YQ K — PWL- 229 

I I I : I I I : I I I I : I : I I I II hi Mil: 

Db 189 ALYTITGGLAAVI YTDTLQTVIMLVGSLILTGFAFHEVG GYDAFMEKYMKAI PTI V 244 

Qy 230 GTVDSSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSS 2 64 

I : | : | | t : : t : I I I I I I I : 

Db 245 SDGNTTFQEKCYTPRADSFHIFRDPLTGDLPWPGFIFGMSILTLWYWCTDQVIVQRCLSA 304 

Qy 265 SSATYAQ VLSFLAAFGCLVMAI PAIL IGAI GASTDWNQT 303 

: : : : : : | : | : | : : I : I 

Db 305 KNMSHVKGGCILCGYLKLMPMFIMVMPGMISRILYTEKIACWPSECEKYCGTKVGCTNI 364 

Qy 304 AYGLPDPKTTEEADMI LPI VLQYLCPVYI S FFGLGAVSAAVMS S ADS S I LSAS SMFARNI 363 

II | : : | | : | : | : : | | | | | | | : : | : I 

Db 355 AY PTLWELMPNGLRGLMLS VMLAS LMS SLTS I FN 3ASTLFTMDI 409 

Qy 364 YQLS FRQNASDKEIVWVMRITVFV- FGASATAMALLTKTVYG — LWYLSSDLVYI--VI F 418 

I | : | | : | | : : I : : I I I : : : I I : I I : I 

Db 410 Y-AKVRKRASEKELMIAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIA 468 

Qy 419 PQLLCVLFVKGTNTYGAVAGYVSGLFLRI TG GEPYLY 4 55 

I : I I I I I I : I I : I II I Ml 

Db 469 AVFLLAIFWKRVNEPGAFWGLILGLLIGISRMITEFAYGTGSCMEPSNCPTIICGVHYLY 528 

Qy 456 LQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFD 515 

: : I M : I : M I I : ' I : : : 



Db 529 FAIILF •AISFITIWISLLTKPI PDVHLYR 558



Qy 516 AV — VARHSEENMDKTILVKNENIKLDELALVKPRQSMTLSSTFTNKE 561 

: | | : | : : | M : I : : : : : : I : 

Db 559 LCWSLRNSKEERID--LDAEEENIQ EGPKETIEIETQVPEKK 598 

RESULT 4 
SL51_RAT 

ID SL51_RAT STANDARD; PRT; 665 AA. 

AC P53790; P97787; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Sodium/glucose cotransporter 1 (Na(+)/glucose cotransporter 1)

DE (High affinity sodium-glucose cotransporter).

GN SLC5A1 OR SGLT1.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; TISSUE=Kidney;

RX MEDLINE=94216314; PubMed=8163506;

RA Lee W.S., Kanai Y., Wells R.G., Hediger M.A.;

RT "The high affinity Na+/glucose cotransporter. Re-evaluation of

RT function and distribution of expression.";

RL J. Biol. Chem. 269:12032-12039(1994).

RN [2] 

RP SEQUENCE FROM N.A. 

RA Kasahara M., Mori K.;

RL Submitted (MAY-1993) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; TISSUE=Jejunum;

RA Aoshima H., Yokoyama T., Tanizaki J., Izu H., Yamada M.;

RT "The sugar specificity of Na/glucose cotransporter from rat jejunum.";

RL Submitted (JAN-1997) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Actively transports glucose into cells by Na(+) co-

CC transport with a Na(+) to glucose coupling ratio of 2:1.

CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is

CC provided by the concerted action of a low affinity high capacity

CC and a high affinity low capacity Na(+)/glucose cotransporter

CC arranged in series along kidney proximal tubules. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- DEVELOPMENTAL STAGE: Appears on embryonic day 18 and gradually 

CC increases until birth. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).
DR PIR; A53582; A53582.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

POTENTIAL.

POTENTIAL.

FT DOMAIN CYTOPLASMIC (POTENTIAL).


FT TRANSMEM 644 664 POTENTIAL.

FT CARBOHYD 248 248 N-LINKED (GLCNAC...) (POTENTIAL).

FT CONFLICT 354 354 Y -> H (IN REF. 3).

SQ SEQUENCE 665 AA; 73066 MW; A92038D964BFF061 CRC64;



Query Match 10.3%; Score 306; DB 1; Length 665; 

Best Local Similarity 23.5%; Pred. No. 4e-14; 

Matches 155; Conservative 105; Mismatches 242; Indels 158; Gaps 29; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

| :::::::: | | : | | : I I I : : I I : I : : I : : I I : I 

Db 32 IVIYFWVMAVGLWAMF3T-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86 

Qy 71 T AEAVYVP G YGLAWAQAP I G Y S L S LILGGLFFAKPMRSK-GYVTMLDPFQQIYGK 124 

I Ml II:: : : t I I I i : I I I I I = I 

Db 87 LA GTGAAAGIAMGGFEWNALVFVWLGWLFV — PIYIKAGWTM PEYLRK 134 

Qy 125 RMGG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI A 172 

I I I I I : I : : : M I | :::: | | I 

Db 135 RFGGKRIQI YLSVLSLLLYIFTKI SADI FSGAI F INLALGLDI YLAIFILLAIT 188 

Qy 173 T L YT L VGGL Y S VAYT DWQL FC I FVGLW I S VP FAL S H P AVAD I GFT AVHAK YQ K- - PWL - 229 

I I I : I I I : I I I I : I : I I : I II hi I I I I I 



Db 189 ALYTI TGGLAAVT YTDTLQTAIMLVGS FI LTGFAFREVG GYEAFMDKYMKAI PTLV 244



Qy 230 --GTVD-SSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSS 264 

I : II: III: : I : I I I Mil: 

Db 245 SDGNITVKEECYTPRADSFHIFRDPITGDMPWPGLIFGLSILALWYWCTDQVIVQRCLSA 304 

Qy 2 65 S SATYAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPDP KTTEEADM 318 

: : : : I : I : : : I I : : I I I : : 

Db 305 KNMSHVKAGCTLCGYLKLLPMFLMVMPGMISRILYTDKIACVLPSECKKYCGTPVGCTNI 364 

Qy 319 ILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIV 378 

I : : II : I : I : : I I I I I I I : : I Ml I : I I : II : : 

Db 365 AYPTLWELMPNGLRGLMLSVMMASLMSSLTSIFNSASTLFTMDIY-TKIRKGASEKELM 423 

Qy 379 WVMRITVFV-FGASATAMALLTKTVYG — LWYLSSDLVYI--VIFPQLLCVLFVKGTNTY 433 

I : : I I I : : : I Ml M I I : I I I 

Db 424 IAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAIFCKRVNEP 483 

Qy 434 GAVAGYVSGLFLRI TG GEPYLYLQPLIFYPGYYPDDN 470 

I I I : I : I II I I I I : : I 
Db 484 GAFWGLILGFLI GI SRMITEFAYGTGSCMEPSNCPKI ICGVHYLYFAI I LF 534 

Qy 471 GIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAV — VARHSEENMDK 528 

I : I : I I I I : I : : : : M I : I 

Db 535 AISWTVLVISLLTKPI PDVHLYRLCWSLRNSTEERID- 572 

Qy 529 TILVKNENIKLDELALVKPRQSMTLSSTFTNKE AFL DVD SSPEGSGTED 577 

I I : : I | : : : : : | | II I I : : M 

Db 573 — LDAGEEEPVEE DPKDTIEIDAEAPQKEKGCFRKAYDLFCGLDQDKGPKMTKEEE 626 



RESULT 5 
SL54_PIG 

ID SL54_PIG STANDARD; PRT; 660 AA. 

AC P31636; 

DT 01-JUL-1993 (Rel. 26, Created) 

DT 01-JUL-1993 (Rel. 26, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Low affinity sodium-glucose cotransporter (Sodium/glucose

DE cotransporter 3) (Na(+)/glucose cotransporter 3).

GN SLC5A4 OR SGLT3 OR SAAT1.

OS Sus scrofa (Pig).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Suina; Suidae; Sus. 

OX NCBI_TaxID=9823; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Kidney; 

RX MEDLINE=93131881; PubMed=8420925; 

RA Kong C.-T., Yet S.-F., Lever J.E.; 

RT "Cloning and expression of a mammalian Na+/amino acid cotransporter 

RT with sequence similarity to Na+/glucose cotransporters.";

RL J. Biol. Chem. 268:1509-1512(1993).

RN [2] 

RP FUNCTION. 

RX MEDLINE=94357885; PubMed=8077195;

RA McKenzie B., Panayotova-Heiermann M., Loo D.D.F., Lever J.E.,

RA Wright E.M.;

RT "SAAT1 is a low affinity Na+/glucose cotransporter and not an amino

RT acid transporter. A reinterpretation.";

RL J. Biol. Chem. 269:22488-22491(1994).

CC -!- FUNCTION: Sodium-dependent glucose transporter. Has a Na+ to 
CC glucose coupling ratio of 1:1. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: KIDNEY, INTESTINE, LIVER, SKELETAL MUSCLE, 
CC AND SPLEEN. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC -!- CAUTION: Was originally (Ref.1) thought to be a sodium/neutral
CC amino acid cotransporter (system a neutral amino acid transporter) 

CC responsible for the sodium-dependent intake of neutral amino acids 

CC such as alanine, glycine, serine, cysteine, and proline. 













CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; L02900; AAC37325.1; -.




DR PIR; A44432; A44432.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE;

FT CARBOHYD 248 248 N-LINKED (GLCNAC...) (POTENTIAL).

SQ SEQUENCE 660 AA; 72745 MW; 38616367F8F18F1A CRC64;



Query Match 10.2%; Score 303.5; DB 1; Length 660; 

Best Local Similarity 23.2%; Pred. No. 5.9e-14; 

Matches 141; Conservative 103; Mismatches 230; Indels 135; Gaps 26; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 7 0 

I :::::::: II : I I MM: : I I I : I :: I : : I I : I 

Db 32 IVIYFWVMAVGLWAMLRT-NRGTV GGFFLAGRDVTWWPMGAS L FASN I GS GH FVG 8 6 

Qy 71 TAEAVYVPGYGLA WAQAP I GYS LS L I LGGL FFAKPMRS KGYVTMLDP FQQ I Y- GKRM 126 

I I : I I | | : : | | | | : | | : : : : | | | : 

Db 87 LAGTGAASGIAIAAFEW NALLLLLVLGWFFVPI YIKAGVMTMPEYLRKRFGGKRL 141 

Qy 127 GGLLFIPAL MGEMFWAAAI FS ALGAT I S VI I DVDMH I S VI I S ALI AT L YT LVG 179 

I I : I : : : I I I | : : : | : : : : : | : | | : | 

Db 142 QIYLSILSLFICVALRISSDIFSGAIF IKLALGLDLYLAI FSLLAITAIYTITG 195 

Qy 180 GL Y S VAYT D WQ L F C I FVGLW I S VP FAL S H P AVAD I G FT AVHAK YQ K- - P W LGT VD 233 

I I II I I I : I : : I : I : I I I I : : II I : I 

Db 196 GLASVIYTDTLQTIIMLIGSFILMGFAF VEyGGYESFTEKYMNAI PTIVEGDNLTI 251 

Qy 234 SSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSSSSATYAQ 271 

I : I : I I I : : I I M I I I I I : : : 

Db 252 SPKCYTPQGDSFHIFRDAVTGDIPWPGMIFGMTWAAWYWCTDQVIVQRCLSGKDMSHVK 311 

Qy 272 VLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPDPKT TEE — ADMILPIVLQ 325 

: : I : : : I I : I : I II : : I : : : 

Db 312 AAC IMCGYLKLLPMFLMVMP GMI S RI L YT EKVACWP S ECVKHCGTEVGC SN YAYP LLVM 371 

Qy 326 YLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFARNI YQLS FRQNAS DKEI VWVMRITV 385 

II : I : I : : I I I I I I I : : I : : I I : I I : I I : : I : : 

Db 372 ELMPSGLRGLMLSVMLASLMSSLTSIFNSASTLFTMDLY-TKIRKQASEKELLIAGRLFI 430

Qy 386 FVFGAS AT AMALLT KTVYG LWYLS SDLVYI — VI FPQLLCVLFVKGTNTYGA V 436 

: : I : I : I I : I I : I I I I I : 

Db 431 ILLIVISIVWPLVQVAQNGQLFHYIESISSYLGPPIAAVFLLAIFCKRVNEQGAFWGLI 490 

Qy 437 AGYVSGL FLRITG -GEPYLYLQPLIFYPGYYPDDNGI YNQKF 477 

1:111 I : I I | | | | : : | : 
Db 491 IGFVMGLIRMIAEFVYGTGSCLAASNCPQIICGVHYLYFALILFF 535

Qy 478 PFKTLAMVT S FLTNI CI S YLAK- YLFE S GT L P P KL D VFDAWARH 521 

I I : I I I I : I : : : : I : I I II 

Db 536 VSILWLAISLLTKPIPDVHLYRLCWALRNSTEERIDL-DAEEKRHEEAHDG 586 

Qy 522 -SEENMDKT 529 

I : I : : I 

Db 587 VDEDNPEET 595 



RESULT 6 
SL52_RABIT 

ID SL52_RABIT STANDARD; PRT; 672 AA. 

AC P26430; 

DT 01-AUG-1992 (Rel. 23, Created) 

DT 01-AUG-1992 (Rel. 23, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 



DE Sodium/nucleoside cotransporter (Na(+)/nucleoside cotransporter).

GN SLC5A2 OR SNST1. 

OS Oryctolagus cuniculus (Rabbit).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus.

OX NCBI_TaxID=9986;

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Kidney; 

RX MEDLINE=92156077; PubMed=1740408; 

RA Pajor A.M., Wright E.M.;

RT "Cloning and functional expression of a mammalian Na+/nucleoside

RT cotransporter. A member of the SGLT family.";

RL J. Biol. Chem. 267:3557-3560(1992).

CC -!- FUNCTION: Actively transports uridine into cells by Na+ 

CC co-transport. May play a role in reabsorption of nucleosides from 

CC glomerular filtrate by the proximal tubule in kidney, and in the 

CC regulation of cardiac contractility by adenosine. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: More abundant in heart than in kidney, where 
CC it is absent from the outer cortex. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; M84020; AAA31421.1; -.

DR PIR; A42251; A42251.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Transmembrane; Sodium transport; Symport; Glycoprotein. 
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POTENTIAL. 
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CARBOHYD 
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N-LINKED (GLCNAC. . .) (POTENTIAL) 


SQ SEQUENCE 672 AA; 73161 MW; E2D987B03B9C57B4 CRC64;



Query Match 10.0%; Score 298; DB 1; Length 672; 

Best Local Similarity 25.0%; Pred. No. 1.5e-13; 

Matches 153; Conservative 89; Mismatches 232; Indels 138; Gaps 25; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

I I : I : : I I : : I I : I : II I h : I I : I : : I : : I I : 

Db 2 6 IAVIAAYFLLVI GVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGASLFASNIGSGH 8 0 

Qy 68 INGTAEAVYVPGYGLAWAQAPIGYSLS LILGGLFFAKPMRSKGYVTMLDPFQQI YG 123 

II III II:: : : I I I I : I : I I I 
Db 81 FVGLA GT GAAN GLAVAG FEWNAL FWLL LGWL FAP VYLT AGVT TM PQYLR 130 

Qy 124 KRMGG LLFI PALMGEMFWAAAI F — SALGATI SVI IDVDMHI SVI I SA 169 

I I I I I : I : : : I : I I I I I : I I I 

Db 131 KRFGGHRIRLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNI YASVIALL 182 

Qy 17 0 L I AT L YT L VG G L Y S VAYT D WQ L F C I FVG LW I S V P F AL S H P AVAD I G FT AVHAK Y 224 

1 : I I : I I I : : II I I I I I I : I : I I : : : II 

Db 183 GITMVTTVTGGLAALMYTDTVQTFVIIAGAFILTGYAFHEVG GYSGLFDKYMGAMT 238 

Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAYF 258 

: I : I : I I I I : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSSCYRPRPDSYHLLRDPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298 

Qy 259 QRVL S S S S AT YAQ VL S FLAAFGC LVMAI P AI L I GAI GAS T DWNQT AYGL P D P KT TE 314 

llh : I : : I : I : : I I : : I I : II 

Db 299 QRCLAGRNLTHIKAGCILCGYLKLTPMFLMVMPGMISRILYPDEVACVAPEVCKRVCGTE 358 

Qy 315 E- -ADMI LP I VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFARNI YQLS FRQNA 372 

: : : I : : II : I ■ : I I : I I I I I : I : : I : I I I I I 
Db 359 VGCSNIAYPRLVVKLMPNGLRGLMLAWILAALMSSLASIFNSSSTLFTMDI YTL — RPRA 416 

Qy 373 SDKEIVWMRITVFVFGASATAMALLTKTVYG LWYLSSDLVYIV — IFPQLLCVLFV 427 

: I : : I I : I I : I : : I hi I : : : I I I 

Db 417 GEGELLLVGRLWWFIVAVSVAWLPWQAAQGGQLFDYIQSVSSYLAPPVSAVFWALFV 476 

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV — 485 

I II I : I I : : I : I I I : I 

Db 477 PRVNEKGAFWGLIGGLLMGLARLIP EFS FGTGSCVRP 513 

Qy 486 TSFLTNICISYLAKYLFE-SG TLP-PKLDVFDAWA-RHSEENMDKTI 530 

: I I : I I I I I I |||:: I : I I I : I 
Db 514 SACPAFLCRVHYLYFAIVLFFCSGLLIIIVSLCTAPIPRKHLHRLVFSLRHSKE 567 

Qy 531 LVKNENIKLDEL 542 

: I :: III 
Db 568 --EREDLDADEL 577



RESULT 7 
SL54_HUMAN 

ID SL54_HUMAN STANDARD; PRT; 659 AA. 

AC Q9NY91; O15279;

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Low affinity sodium-glucose cotransporter (Sodium/glucose 

DE cotransporter 3) (Na(+)/glucose cotransporter 3).

GN SLC5A4 OR SAAT1 OR SGLT2.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 
RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Small intestine; 

RA Gorboulev V., Baumgarten K., Veyhl M., Koepsell H.;

RT "The molecular cloning and functional characterization of the human

RT SGLT2 transporter.";

RL Submitted (FEB-2000) to the EMBL/GenBank/DDBJ databases.

RN [2]

RP SEQUENCE FROM N.A.

RX MEDLINE=20057165; PubMed=10591208;

RA Dunham I., Hunt A.R., Collins J.E., Bruskiewich R., Beare D.M.,

RA Clamp M., Smink L.J., Ainscough R., Almeida J.P., Babbage A.K.,

RA Bagguley C., Bailey J., Barlow K.F., Bates K.N., Beasley O.P.,

RA Bird C.P., Blakey S.E., Bridgeman A.M., Buck D., Burgess J.,

RA Burrill W.D., Burton J., Carder C., Carter N.P., Chen Y., Clark G.,

RA Clegg S.M., Cobley V.E., Cole C.G., Collier R.E., Connor R.,

RA Conroy D., Corby N.R., Coville G.J., Cox A.V., Davis J., Dawson E.,

RA Dhami P.D., Dockree C., Dodsworth S.J., Durbin R.M., Ellington A.G.,

RA Evans K.L., Fey J.M., Fleming K., French L., Garner A.A.,

RA Gilbert J.G.R., Goward M.E., Grafham D.V., Griffiths M.N.D., Hall C.

RA Hall R.E., Hall-Tamlyn G., Heathcott R.W., Ho S., Holmes S.,

RA Hunt S.E., Jones M.C., Kershaw J., Kimberley A.M., King A.,

RA Laird G.K., Langford C.F., Leversha M.A., Lloyd C., Lloyd D.M.,

RA Martyn I.D., Mashreghi-Mohammadi M., Matthews L.H., Mccann O.T.,

RA Mcclay J., Mclaren S., McMurray A.A., Milne S.A., Mortimore B.J.,

RA Odell C.N., Pavitt R., Pearce A.V., Pearson D., Phillimore B.J.C.T.,

RA Phillips S.H., Plumb R.W., Ramsay H., Ramsey Y., Rogers L., Ross M.T

RA Scott C.E., Sehra H.K., Skuce C.D., Smalley S., Smith M.L.,

RA Soderlund C., Spragon L., Steward C.A., Sulston J.E., Swann R.M.,

RA Vaudin M., Wall M., Wallis J.M., Whiteley M.N., Willey D.L.,

RA Williams L., Williams S.A., Williamson H., Wilmer T.E., Wilming L.,

RA Wright C.L., Hubbard T., Bentley D.R., Beck S., Rogers J., Shimizu N

RA Minoshima S., Kawasaki K., Sasaki T., Asakawa S., Kudoh J.,

RA Shintani A., Shibuya K., Yoshizaki Y., Aoki N., Mitsuyama S.,

RA Roe B.A., Chen F., Chu L., Crabtree J., Deschamps S., Do A., Do T.,

RA Dorman A., Fang F., Fu Y., Hu P., Hua A., Kenton S., Lai H., Lao H.I

RA Lewis J., Lewis S., Lin S.-P., Loh P., Malaj E., Nguyen T., Pan H.,

RA Phan S., Qi S., Qian Y., Ray L., Ren Q., Shaull S., Sloan D., Song L

RA Wang Q., Wang Y., Wang Z., White J., Willingham D., Wu H., Yao Z.,

RA Zhan M., Zhang G., Chissoe S., Murray J., Miller N., Minx P.,

RA Fulton R., Johnson D., Bemis G., Bentley D., Bradshaw H., Bourne S.,

RA Cordes M., Du Z., Fulton L., Goela D., Graves T., Hawkins J.,

RA Hinds K., Kemp K., Latreille P., Layman D., Ozersky P., Rohlfing T.,

RA Scheet P., Walker C., Wamsley A., Wohldmann P., Pepin K., Nelson J.,

RA Korf I., Bedell J.A., Hillier L.W., Mardis E., Waterston R.,

RA Wilson R., Emanuel B.S., Shaikh T., Kurahashi H., Saitta S.,

RA Budarf M.L., McDermid H.E., Johnson A., Wong A.C.C., Morrow B.E.,

RA Edelmann L., Kim U.J., Shizuya H., Simon M.I., Dumanski J.P.,

RA Peyrard M., Kedra D., Seroussi E., Fransson I., Tapia I., Bruder C.E.,

RA O'Brien K.P., Wilkinson P., Bodenteich A., Hartman K., Hu X.,

RA Khan A.S., Lane L., Tilahun Y., Wright H.;

RT "The DNA sequence of human chromosome 22.";

RL Nature 402:489-495(1999). 

RN [3] 

RP SEQUENCE OF 73-247 FROM N.A. 

RC TISSUE=Brain; 

RA Poppe R., Koepsell H.; 

RL Submitted (DEC-1995) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Sodium-dependent glucose transporter (By similarity).

CC -!- SUBCELLULAR LOCATION: Integral membrane protein.

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 



CC 
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EMBL; AJ133127; CAB81772.1; 




DR EMBL; AL008723; CAB51758.1;

DR EMBL; U41897; AAB61732.1; -.

DR Genew; HGNC:11039; SLC5A4.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar
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KW Glycoprotein.
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FT 


CARBOHYD 
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N-LINKED (GLCNAC. . . ) (POTENTIAL) 


FT CONFLICT    76    76 A -> V (IN REF. 3).

FT CONFLICT   106   106 S -> P (IN REF. 3).

FT CONFLICT   243   243 V -> I (IN REF. 3).

SQ SEQUENCE 659 AA; 72455 MW; F8A34AED648B523A CRC64;



Query Match 9.9%; Score 294; DB 1; Length 659; 

Best Local Similarity 22.5%; Pred. No. 2.7e-13; 

Matches 135; Conservative 97; Mismatches 225; Indels 144; Gaps 22; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

I :: : : I : : : I I : I I : I I I : : I I I : | : : | : : | : | 

Db 32 IVIYFLWMAVGLWAMLKT-NRGTI GGFFLAGRDMAWWPMGAS LFASNI GSNH YVG 86 

Qy 71 TAEAVYVP GYGLAWAQAP I G Y S LSLILGGLFFAKPMRSKGYVTMLDPFQQI YGKR 125 

I III I : : : I I I I : I :: I I : I I : II 

Db 87 LA GTGAAS GVATVT FEWT S S VMLLI LGWI FVP I YI KS - GVMTM PEYLKKR 135 

Qy 126 MGG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI AT 173 

II : : I : Mill I : : : I :: : : : I : 

Db 136 FGGERLQVYLSILSLFICWLLISADIFAGAIF 1 KLALGLDLYLAI FI LLAMTA 189 

Qy 174 L YT LVGGL YS VAYT DWQLFC I FVGLWI S VP FAL S HPAVADI GFTAVHAKYQKPWLGTVD 233 

: I I I I I I I I I I : I : : I : | : | | : | : : | | I : 

Db 190 VYTTTGGLASVI YTDTLQTI IMLI GS FI LMGFAFNEVG GYESFTEKYVNATPSWE 245 

Qy 234 SSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSSS 265 

I : I : III: : I I I I I I I I 

Db 246 GDNLTISASCYTPRADSFHIFRDAVTGDIPWPGIIFGMPITALWYWCTNQVIVQRCLCGK 305 

Qy 266 SAT YAQVL S FLAAFGCLVMAI PAI L I GAI GAS T DWNQT AYGL P DPKTTEEA 316 

: : : : I : I : : : I I : I : I I II 

Db 306 DMSHVKAACIMCAYLKLLPMFLMVMPGMISRILYTDMVACWPSECVKHCGVDVGCTNYA 365 

Qy 317 DMI LP I VLQ YLC PVYI S FFGLGAVSAAVMS S ADS S I L SAS SMFARN I YQL S FRQNAS DKE 376 

I : : II : I : I : : I I I I I I I :: I : : I I : I I : i I 

Db 366 YPTMVLELMPQGLRGLMLSVMLASLMSSLTSIFNSASTLFTIDLY-TKMRKQASEKE 421 

Qy 377 IVWVMRITVFVFGASATAMALLTKTVYG LWYLSSDLVYI — VI FPQLLCVLFVKGTN 431 

: : III: : I : : I I I : I : : I I I 

Db 422 LLIAGRI FVLLLTWSIWVPLVQVSQNGQLIHYTES I SSYLGPPIAAVFVLAI FCKRVN 4 81 

Qy 432 TYGAVAGYVSGLFL RITGGEPYLYLQPLI FYPGYYPD 468 

I I I : I I : : I. I I I I : : I : 
Db 4 82 EQGAFWGLMVGLAMGLI RMITEFAYGTGSCLAPSNCPKI ICGVHYLYFS IVLFF 535 

Qy 469 DNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDA — WARHSEENM 526 

I I : I I I I : I : :: I : : I I : 

Db 536 GSMLVTLGI SLLTKPI PDVHLYRLCWVLRNSTEERI 571 



Qy 527 D 527

       |

Db 572 D 572



RESULT 8 
SL51_SHEEP

ID SL51_SHEEP STANDARD; PRT; 664 AA. 

AC P53791; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Sodium/glucose cotransporter 1 (Na(+)/glucose cotransporter 1)

DE (High affinity sodium-glucose cotransporter).

GN SLC5A1 OR SGLT1.

OS Ovis aries (Sheep).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea; 

OC Bovidae; Caprinae; Ovis. 

OX NCBI_TaxID=9940; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Jejunal mucosa; 

RX MEDLINE=96077158; PubMed=7492327;

RA Tarpey P., Wood I.S., Shirazi-Beechey S.P., Beechey R.B.;

RT "Amino acid sequence and the cellular location of the Na(+)-dependent

RT D-glucose symporters (SGLT1) in the ovine enterocyte and the parotid 

RT acinar cell."; 

RL Biochem. J. 312:293-300(1995). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Mammary gland; 

RX MEDLINE=98050042; PubMed=9388688;

RA Shillingford J.M., Wood I.S., Shennan D.B., Shirazi-Beechey S.P.,

RA Beechey R.B.;

RT "Determination of the sequence of a mRNA from lactating sheep mammary

RT gland that encodes a protein identical to the Na(+)-dependent glucose

RT transporter (SGLT1).";

RL Biochem. Soc. Trans. 25:467-467(1997). 

CC -!- FUNCTION: Actively transports glucose into cells by Na(+) co-

CC transport with a Na(+) to glucose coupling ratio of 2:1.

CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is

CC provided by the concerted action of a low affinity high capacity

CC and a high affinity low capacity Na(+)/glucose cotransporter

CC arranged in series along kidney proximal tubules.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein.

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X82411; CAA57809.1; -. 

DR EMBL; X82410; CAA57808.1; -. 

DR EMBL; AJ001026; CAA04483.1; 

DR PIR; S59637; S59637. 



DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar transport; Transmembrane; Sodium trans

KW Glycoprotein.






FT DOMAIN       1    28 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM    29    47 POTENTIAL.

FT DOMAIN      48    64 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM    65    85 POTENTIAL.

FT DOMAIN      86   105 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM   106   126 POTENTIAL.

FT DOMAIN     127   171 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM   172   192 POTENTIAL.

FT DOMAIN     193   208 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM   209   229 POTENTIAL.

FT DOMAIN     230   270 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM   271   291 POTENTIAL.

FT DOMAIN     292   314 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM   315   334 POTENTIAL.

FT DOMAIN     335   423 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM   424   443 POTENTIAL.

FT DOMAIN     444   455 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM   456   476 POTENTIAL.

FT DOMAIN     477   526 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM   527   547 POTENTIAL.

FT DOMAIN     548   642 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM   643   663 POTENTIAL.

FT CARBOHYD   248   248 N-LINKED (GLCNAC...) (PC

SQ SEQUENCE 664 AA; 73178 MW; 820AC019B5C93987 CRC64;



Query Match 9.9%; Score 294; DB 1; Length 664; 

Best Local Similarity 23.9%; Pred. No. 2.8e-13; 

Matches 127; Conservative 93; Mismatches 202; Indels 110; Gaps 23 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67

I :::::::: I I : I I : I I I : : I I : I : : I : : I I : 

Db 32 IVIYFWVMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86 

Qy 68 INGTAEAVYVPGYGLAWAQAP I G YS LS L I LGGLFFAKPMRS K- GYVTMLDP FQQI YGKRM 12 6 

: I I I : II | : : | | : | I : I I I I I : II 

Db 87 LAGT GAAAG I AT G G F EWN AL I L WL LGWVFV- - P I Y I KAGWTM PEYLRKRF 136 

Qy 127 GG LLFI PALMGEMFWAAAI FSALGATI SVI IDVDMHI SVI I SALI ATL 174 

II : I : I : : : I I I I : : : : I : : : : : I II 

Db 137 GGQRIQVYLSVLSLVLYI FTKISADI FSGAI F INLALGLDLYLAIFILLAITAL 190 

Qy 175 YTLVGGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDS 234 

I I : I I I : I I I I : I : : I : I II I : : I II : I I I 

Db 191 YTITGGLAAVIYTDTLQTVIMLLGSFILTGFAFHEVG GYSAFVTKYMNA-I PTVTS 245 

Qy 235 SEVYS-WLDSFLLL MLGGIPW QAYFQRVLSSS 265 

I I : I I I : : I : I I I I I I I : 

Db 246 YGNTTVKKECYTPRADS FHI FRDPLKGDLPWPGLI FGLTI I SLWYWCTDQVI VQRCLSAK 305 



Qy 2 66 SATYAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPDPKTTEE AD 317 

: : : : : : I : : : I I : I : | | : : 

Db 306 NMSHVKAGCIMCGYMKLLPMFLMVMPGMISRILFTEKVACTV--PSECEKYCGTKVGCTN 363 

Qy 318 MI L P I VLQ YLC PVYI S FFGLGAVS AAVMS S ADS S I LS AS SMFARN I YQLS FRQNAS DKE I 377 

: I : : II : I : I : : I I I I I I I : : I : I I I : I I : I I : 

Db 364 IAYPTLWELMPNGLRGLMLSVMLAS LMSSLTSIFNSASTLFTMDIY-TKIRKKASEKEL 422 

Qy 378 VWVMRITVFV- FGASATAMALLTKTVYG — LWYLSSDLVYI--VI FPQLLCVLFVKGTNT 432 

: I : : I I I : : : I hi I : I I : I I I 

Db 423 MIAGRLFMLVLIGVSIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAIFCKRVNE 4 82 

Qy 4 33 YGAVAGYVSGLFLRI TG GEPYLYLQPLIF 4 61 

II I : I : : II I I I I : : I 

Db 4 83 PGAFWGLIIGFLIGVSRMITEFAYGTGSCMEPSNCPTIICGVHYLYFAIILF 534 



RESULT 9 
SGLT_VIBPA 

ID SGLT_VIBPA STANDARD; PRT; 543 AA. 

AC P96169; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Sodium/glucose cotransporter (Na(+)/glucose symporter).

GN SGLT.

OS Vibrio parahaemolyticus.

OC Bacteria; Proteobacteria; Gammaproteobacteria; Vibrionales;

OC Vibrionaceae; Vibrio. 

OX NCBI_TaxID=670; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=AQ3334; 

RX MEDLINE=96248401; PubMed=8652595;

RA Sarker R.I., Okabe Y., Tsuda M., Tsuchiya T.;

RT "Sequence of a Na+/glucose symporter gene and its flanking regions of

RT Vibrio parahaemolyticus.";

RL Biochim. Biophys. Acta 1281:1-4(1996).

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20400508; PubMed=10835424;

RA Turk E., Kim O., Le Coutre J., Whitelegge J.P., Eskandari S.,

RA Lam J.T., Kreman M., Zampighi G., Faull K.F., Wright E.M.;

RT "Molecular characterization of Vibrio parahaemolyticus vSGLT: a model

RT for sodium-coupled sugar cotransporters.";

RL J. Biol. Chem. 275:25711-25716(2000). 

RN [3] 

RP MASS SPECTROMETRY OF FORMYLATED FORM, AND REVISIONS TO N-TERMINUS. 

RX MEDLINE=20222957; PubMed=10757971;

RA le Coutre J., Whitelegge J.P., Gross A., Turk E., Wright E.M.,

RA Kaback H.R., Faull K.F.;

RT "Proteomics on full-length membrane proteins using mass

RT spectrometry.";

RL Biochemistry 39:4237-4242(2000). 

CC -!- FUNCTION: ACTIVELY TRANSPORTS GLUCOSE INTO CELLS BY NA(+) CO-
CC TRANSPORT (BY SIMILARITY).

CC -!- SUBCELLULAR LOCATION: Integral membrane protein.

CC -!- MASS SPECTROMETRY: MW=60680; METHOD=Electrospray.

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; D78137; BAA11215.1; ALT_FRAME.

DR EMBL; AF255301; AAF80602.1; -.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; FALSE_NEG.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport. 



FT TRANSMEM    10    30 POTENTIAL.

FT TRANSMEM    45    65 POTENTIAL.

FT TRANSMEM    79    99 POTENTIAL.

FT TRANSMEM   129   149 POTENTIAL.

FT TRANSMEM   156   176 POTENTIAL.

FT TRANSMEM   193   213 POTENTIAL.

FT TRANSMEM   246   266 POTENTIAL.

FT TRANSMEM   287   307 POTENTIAL.

FT TRANSMEM   345   365 POTENTIAL.

FT TRANSMEM   401   421 POTENTIAL.

FT TRANSMEM   427   447 POTENTIAL.

FT TRANSMEM   455   475 POTENTIAL.

FT TRANSMEM   483   503 POTENTIAL.

FT TRANSMEM   523   543 POTENTIAL.

SQ SEQUENCE 543 AA; 58874 MW; 61BE3F7E380BC32C



Query Match 9.9%; Score 293.5; DB 1; Length 543; 

Best Local Similarity 25.1%; Pred. No. 2.4e-13; 

Matches 139; Conservative 96; Mismatches 198; Indels 121; Gaps 27; 

Qy 4 HVEGLIAIIVF — YLLILL-VGIWAAWRTKNSGSAEERSEAI IVGGRDIGLLVGGFTMTA 60 

I I I : I I | : I : : | | : I : : : : : : I : I : : I : : I 

Db 6 HGL S F I D IMVFAI YVAI 1 1 GVGLWV SRDKKGTQKSTEDYFLAGKSLPWWAVGASLIA 62 

Qy 61 TWV GGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGY 111 

: I I I I I I I II : : I I : I : I IS 

Db 63 ANISAEQFIGMSGSGYSIGLAIASY EWMSA ITLIIVGKYFLPIFIEKGI 111 



Qy 112 VTMLDP FQQI YGKRMGGLLFI PALMGEMFWAAA- 1 FSAL GAT I S VI I DVDMH I 163 

I : : : : : I : : : I : I I : II I I : I : : : 

Db 112 YTIPEFVEKRFNKKLKTILAV FWI SLYIFVNLTSVLYLGGLALETILGIPLMY 164 



Qy 164 S VI I SAL I AT L YT LVGGL Y S VAYT D WQL FC I FVGLWI S VP FAL S H P AVAD I G FTAVHAK 223 

I : : I I I : I : : I I 1 : I : I I I : I : I : : I I : : I I : I I I 
Db 165 SILGLALFALVYSIYGGLSAWWTDVIQVFFLVLG GFMTTYMAVSFIGGT 214 

Qy 224 YQKPWLGTV DSSEVYSWLDSFLLLMLGGIPW QAY 257 



Db 215 — DGWFAGVSKMVDAAPGHFEMILDQSNPQYMNLPG-IAVLIGGL-WANLYYWGFNQYI 270 

Qy 258 FQ RVL S S S SAT YAQVL S FLAAFGC LVMAI P AI L I G- AI GAS T DWNQT AYGL P D P KTT E — 314 

I I I : : I : I I INI:: : I I I I | | I | 

Db 271 IQRTLAAKSVSEAQKGIVFAAFLKLIVPFLWLPGIAAYVITSDPQLMASLGDIAATNLP 330 

Qy 315 EADMI LP I VLQ YLC P VYI S FFGLGAVS AAVMS S ADS S I L S AS SMFARN I YQL S FRQN 371 

II | : | : | | | : I : : I I : : I I I : I : : : I : I I : : 

Db 331 SAANADKAYPWLTQFL-PVGVKGVVFAALAAAIV^ 389 

Qy 372 AS DKE I VWVMRI TVFVFGASATAMALLTKTVYG — LWYL S S DLVYI VIFPQLLCV 424 

: I : : I I I : I : I I : : I : II : : I : I I 

Db 390 SGDHKLVNV GRTAAWALI I ACLI APMLGGI GQAFQ YIQEYTGLVS PGI LAV 441 

Qy 425 LFVKGTNTYGAVAGYVSGLFLRITGGEPY-LYLQPLIFYPGYYP-DDNGIYNQKFP 478 

I I I I : I I : I I : : I : I : I : II I I : I I 
Db 4 42 FLLGLFWKKTTSKGAIIGWASI PFALFLK FMP L SM P FMDQML YT L L FT 4 90 

Qy 479 FKTLAMVTSFLTNI 492 

: I II I : I 
Db 491 MWIAF-TSLSTSI 503 

RESULT 10 
SL52_HUMAN



ID SL52_HUMAN STANDARD; PRT; 672 AA. 

AC P31639; 

DT 01-JUL-1993 (Rel. 26, Created) 

DT 01-JUL-1993 (Rel. 26, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Sodium/glucose cotransporter 2 (Na(+)/glucose cotransporter 2)

DE (Low affinity sodium-glucose cotransporter).

GN SLC5A2 OR SGLT2.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Kidney; 

RX MEDLINE=93035768; PubMed=1415574;

RA Wells R.G., Pajor A.M., Kanai Y., Turk E., Wright E.M., Hediger M.A.;

RT "Cloning of a human kidney cDNA with similarity to the sodium-glucose

RT cotransporter.";

RL Am. J. Physiol. 263:F459-F465(1992).

CC -!- FUNCTION: Sodium-dependent glucose transporter. Has a Na+ to 
CC glucose coupling ratio of 1:1. 

CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is 
CC provided by the concerted action of a low affinity high capacity 

CC and a high affinity low capacity Na(+)/glucose cotransporter

CC arranged in series along kidney proximal tubules.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein.

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -



CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

DR EMBL; M95549; AAA36608.1; -.

DR PIR; A56765; A56765.

DR Genew; HGNC:11037; SLC5A2.

DR MIM; 182381; -.

DR GO; GO:0016021; C:integral to membrane; TAS.

DR GO; GO:0005362; F:low-affinity glucose:sodium symporter activity; TAS.

DR GO; GO:0005975; P:carbohydrate metabolism; TAS.

DR GO; GO:0006810; P:transport; TAS.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport;

KW Glycoprotein.








FT DOMAIN       1    25 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM    26    44 POTENTIAL.

FT DOMAIN      45    61 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM    62    82 POTENTIAL.

FT DOMAIN      83   102 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM   103   123 POTENTIAL.

FT DOMAIN     124   168 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM   169   188 POTENTIAL.

FT DOMAIN     189   205 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM   206   226 POTENTIAL.

FT DOMAIN     227   270 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM   271   291 POTENTIAL.

FT DOMAIN     292   314 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM   315   334 POTENTIAL.

FT DOMAIN     335   423 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM   424   443 POTENTIAL.

FT DOMAIN     444   455 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM   456   476 POTENTIAL.

FT DOMAIN     477   526 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM   527   547 POTENTIAL.

FT DOMAIN     548   650 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM   651   671 POTENTIAL.

FT CARBOHYD   250   250 N-LINKED (GLCNAC...) (POTENTIAL).

FT CARBOHYD   399   399 N-LINKED (GLCNAC...) (POTENTIAL).

FT SITE        40    40 IMPLICATED IN SODIUM COUPLING

FT                      (BY SIMILARITY).

FT SITE       300   300 IMPLICATED IN SODIUM COUPLING

FT                      (BY SIMILARITY).

SQ SEQUENCE 672 AA; 72896 MW; 233C65F1601B0337 CRC64;





Query Match 9.8%; Score 292; DB 1; Length 672; 

Best Local Similarity 24.1%; Pred. No. 3.9e-13; 

Matches 147; Conservative 91; Mismatches 237; Indels 136; Gaps 22; 
Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 



• • i ""ii"' i i • i • ii i i • I I i : : i : : i i : 

Db 2 6 ILVIAAYFLLVIGVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGASLFASNIGSGH 8 0 

Qy 68 I NGTAEAVYVP GYGLAWAQAP I G YS L S LILGGLFFAKPMRSKGYVTMLDPFQQIYG 123 

II I I I II:: :: I I I I : I : I I | 
Db 81 FVGLA GT GAAS GLAVAG FEWNAL FWL LL GWL FAP VYLT AGVI TM PQYLR 130 

Qy 124 KRMGG LLFI PALMGEMFWAAAIF — SAL GAT I S VI ID VDMH IS VI ISA 169 

II II I : I : : : I : I I I I I : III 

Db 131 KRFGGRRIRLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNI YASVIALL 182 

Qy 17 0 L I AT L YT LVGGL Y S VAYT D WQ L FC I FVGLW I S VP FAL S H P AVAD I G FTAVHAK Y 224 

I : II : III : : III II I I I I : : I I : : : | | 

Db 183 GITMIYTVTGGLAALMYTDTVQTFVILGGACILMGYAFHEVG GYSGLFDKYLGAAT 238 

Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAYF 258 

: I : I : I I I : II : I : I I | 

Db 239 SLTVSEDPAVGNISSFCYRPRPDSYHLLRHPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298 

Qy 259 Q RVL S S S SAT YAQ VL S FLAAFGC L VMAI P AI L I GAI GAS T DWN QT AYGL P D P KT TE 314 

I I I : I I : : I : I : : I I : : | : | : | | 

Db 299 QRCLAGKSLTHI KAGCI LCGYLKLTPMFLMVMPGMI SRILYPDEVACWPEVCRRVCGTE 358 

Qy 315 E- - ADMI LP I VLQYLC PVYI S FFGLGAVSAAVMS S ADS S I L S AS SMFARN I YQLS FRQNA 372 

: : : I : : II : I : I I : I I I I I : I : : I : I I I I 

Db 359 VGCSNIAYPRLWKLMPNGLRGLMLAVMLAALMSSLASIFNSSSTLFTMDIY-TRLRPRA 417 

Qy 373 SDKEIVWVMRI-TVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLCV LFV 427 

I : I :: I I : I I : | : : : | : | : | : | | | | 

Db 418 G D RE L L L VG RL WWF I VWS VAW L P WQ AAQ GGQ L FD Y I Q AVS S Y LAP P VS AVFVLAL FV 477 

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV-- 485 

I I I I : I I : : I : I I : : | 

Db 478 PRVNEQGAFWGLIGGLLMGLARLIP EFSFGSGSCVQP 514 

Qy 486 TSFLTNICISYLAKYLFE-SGTLPPKLDVFDAW ARHSEENMDKTI 530 

: II : I I I I I I I : : I : II: 
Db 515 SAC PAFLCGVH YL YFAI VLFFC S GLLT LTVS LCTAPI PRKHLHRLVFS LRHS KE 568 

Qy 531 LVKNENIKLDE 541 

: I :: II 
Db 569 --EREDLDADE 577

RESULT 11 
SL54_MOUSE



ID SL54_MOUSE STANDARD; PRT; 656 AA. 

AC Q9ET37; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Low affinity sodium-glucose cotransporter (Sodium/glucose 

DE cotransporter 3) (Na(+)/glucose cotransporter 3).

GN SLC5A4A.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=129/Sv; 

RX MEDLINE=20499361; PubMed=11042146;

RA Pletcher M.T., Roe B.A., Chen F., Do T., Do A., Malaj E., Reeves R.H.;

RT "Chromosome evolution: the junction of mammalian chromosomes in the 

RT formation of mouse chromosome 10."; 

RL Genome Res. 10:1463-1467(2000). 

CC -!- FUNCTION: Sodium-dependent glucose transporter (By similarity). 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF251267; AAG01741.1; -. 

DR MGD; MGI:1927848; Slc5a4a.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; FALSE_NEG.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport; 

KW Glycoprotein. 
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FT 


DOMAIN 230 270 EXTRACELLULAR (POTENTIAL).


FT TRANSMEM 271 291 POTENTIAL.
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FT 


TRANSMEM 315 334 POTENTIAL.
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POTENTIAL. 
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CYTOPLASMIC (POTENTIAL) . 


FT 


TRANSMEM 
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655 


POTENTIAL. 


FT CARBOHYD 248 248 N-LINKED (GLCNAC...) (POTENTIAL).


SQ SEQUENCE 656 AA; 71837 MW; A6668E815204D39B CRC64;



Query Match 9.8%; Score 290; DB 1; Length 656;



Best Local Similarity 22.3%; Pred. No. 5.2e-13; 

Matches 147; Conservative 102; Mismatches 252; Indels 158; Gaps 25; 



Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70

1:::::::: l|:|| :| I : || : 1 :: |: :| |: |

Db 32 I VI YFVVVMAVGVWAMLKTNRSTVG GFFLAGRSMTWWPMGAS LFASN I GS GH FVG 86

Qy 71 TAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSK-GYVTMLDPFQQIYGKRMGG- 128

1 1 : 1 : : | | : | | : | I : | | : | | : | | | |

Db 87 LAGTGAASGIAVT-AFESHSFALLLVLGWIFV--PIYIKAGVMTM PEYLKKRFGGK 139

Qy 129 LLFI PALM--GEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALIATLYT 176

III : : : : I : | | | I :::::::::: | | : : |

Db 140 RLQIYLSILFLFICVILTISADIF-SGAIF I KLALGLNLYLAI LI LLAITAI FT 192

Qy 177 LVGGL Y S VAYT DWQ L FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAK YQ K PW L GT VD 233

• 1 1 M : II : 1 : 1 1 I I : : I : | :

Db 193 I TGGLAS VI YT DTVQAVIMLVGS FI LMVFAF VEVGGYESFTEKFMNAI PSWEGDN 248

Qy 234 SSEVYS-WLDSFLLL MLGGIPW QAY FQRVL S S S SAT 268

Ml: III: : 1 1 1 1 | | | | : :

Db 249 LTINSRCYTPQPDSFHIFRDPVTGDIPWPGTAFGMPITALWYWCINQVIVQRCLCGKNLS 308

Qy 269 YAQVLSFLAAFGCLVMAI PAILIGAIGASTDWNQTAYGLP DPKTTEEADMI 319

: : 1 : 1 : : : 1 1 : I : | III

Db 309 H VKAAC I L C G Y L KL L P L F FMVMP GMI S R I L YT DMVAC WP S E C VKH C G VD VG C TN YA 365

Qy 320 LPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVW 379

Is:: II : 1 : 1 : : 1 1 1 1 1 1 1 : : 1 : : 1 1 : I I : : | : :

Db 366 YPMLVLKLMPPGLRGLMLSVMLASLMSSLTSVFNSASTLFTIDLY-TKIRKKASERELLI 424

Qy 380 VMRITVFVFGASATAMALLTKTVYG LWYLSSDLVYI--VI FPQLLCVLFVKGTNTYG 434

1 : 1 1 : : : : 1 : 1 : 1 : 1 1 : I I I I

Db 425 AGRLFVSVLIVTSILWVPIVEVSQGGQLVHYTEAISSYLGPPIAAVFLVAVFCKRANEQG 484

Qy 435 AVAGYVSGLFL RITGGEPYLYLQPLI FYPGYYPDDNG 471

1 1 : II : :| 1 II! ::|:

Db 485 AFWGLMVGLVMGLIRMIAEFSYGTGSCLAPSSCPKIICGVHYLYFAIILFF 535

Qy 472 IYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDV FDAWARH S EENMD 527

1 : : 1 1 1 1 III : I 1 : 1

Db 536 VCILVILGVSYLTK PIPDVHLHRLCWALRNSKEERID 572

Qy 528 KTILVKNEN I KLDELALVKPR Q SMT L S S T FTN KEAFL DVD S S P 570

1 M : : 1 Ml 1 1 : 1 1 1 1

Db 573 LDAEDKEENGADDRTEEDQTEKPRGCLKKTCDLFCGLQRAEFKLTKVEEEALTDTTEKE 631



RESULT 12 
SL53_MOUSE 

ID SL53_MOUSE STANDARD; PRT; 718 AA.

AC Q9JKZ2; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Sodium/myo-inositol cotransporter (Na(+)/myo-inositol cotransporter).

GN SLC5A3.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 
RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20237552; PubMed=10773690;

RA McVeigh K.E., Mallee J.J., Lucente A., Barnoski B.L., Wu S.,

RA Berry G.T.;

RT "Murine chromosome 16 telomeric region, homologous with human

RT chromosome 21q22, contains the osmoregulatory Na(+)/myo-inositol

RT cotransporter (SLC5A3) gene."; 

RL Cytogenet. Cell Genet. 88:153-158(2000). 

CC -!- FUNCTION: Prevents intracellular accumulation of high 

CC concentrations of myo-inositol (an osmolyte) that result in

CC impairment of cellular function.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein.

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; AF220915; AAF43668.1; 

DR MGD; MGI:1858226; Slc5a3.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Transmembrane; Sodium transport; Symport; Glycoprotein.
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SITE 


24 24 IMPLICATED IN SODIUM COUPLING

FT (BY SIMILARITY).


FT SITE 285 285 IMPLICATED IN SODIUM COUPLING

FT (BY SIMILARITY).


SQ SEQUENCE 718 AA; 79554 MW; D035CFBECDDA803B CRC64;



Query Match 9.7%; Score 289; DB 1; Length 718; 

Best Local Similarity 21.7%; Pred. No. 6.8e-13; 

Matches 150; Conservative 113; Mismatches 209; Indels 218; Gaps 32; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGG- 66 

II:: ::::!:: :| :| |:: I : I : I :| III I 

Db 10 IAWALYFILVMCIGFFAMWKSNRSTVS GYFLAGRSM--TWVAIGA 53

Qy 67 --YIN GT AEAVYVP G YG LAWAQ AP I G Y S L S L I LGGL F FAK PMRS KGYVTM 114

: : : :: I I I : I I : : I : | | : | : I I I I I 

Db 54 SLFVSNIGSEHFI GLAGSGAASGFAVGAWEFNALLLLQLLGWVFIPI YIRS-GVYTM 109 

Qy 115 LDPFQQIYGKRMGG LLFI PALMGEMFWAAAI F SAL GAT I S VI I DVDMH 162 

: II I I I I : I : : : I : I I : : : : 

Db 110 PEYLSKRFGGHRIQVYFAALSLLLYIFTKLSVDLYSGALF IQESLGWNLY 159 

Qy 163 ISVIISALIATLYTLVGGLYSVAYTDWQLFCIFVG LWISV PFAL 207 

: I I I : : I I : I I I : I I I I : I : : I : I I : : I 

Db 160 VS VI LLI GMTALLTVTGGLVAVI YTDTLQALLMI I GALTLMVT SMVKI GGFEEVKRRYML 219 

Qy 208 SHPAVADI GFTAVHAKYQK PWLGTV DSSEVYSWLDS 243 

: I II I III Ml: : I : I 

Db 220 ASPDVASILLKYNLSNTNACMVHPKANALKMLRDPTDEDVPWPGFILGQTPASWYWCAD 279 

Qy 244 FLLLMLGGI PWQAYFQRVL S S S S AT YAQ VLS FLAAFGCLVMAI PAI L 290 

I MM::: : | : : | | : : : | : : 

Db 280 QVI VQRVLAAKNIAHAKGSTLMAGFLKLLPMFI I WPGMI SRI VFADEI 328

Qy 291 IGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSA 342 

: I : : II I : : III: : : | 

Db 329 ACINPEHCMQVCGSRAGCSNIAY PRLVMTLVPVGLRGLMMAVMIA 373 

Qy 343 AVMS S ADS S I LSAS SMFARNI YQL S FRQNAS DKEI VWVMRI TV- FVFGASATAMALLT KT 401 

I : I I II I I I : : I : : I : I I : : II : I : : I M I I : I : : : : 

Db 374 ALMSDLDS I FNSASTI FTLDVYKL- IRKSASSRELMI VGRI FVAFMWI SI AWVPI IVEM 432 

Qy 402 VYGLWYLSSDLVYIVIFPQL LCVL FVKGTNT YGAVAGYVSG LFLRITGG 450 

IN I : I : I : I I I II : I I : I I I I I 

Db 433 QGGQMYLYIQEVADYLTPPVAALFLLAI FWKRCNEQGAFYGGMAGFVLGAVRLILAFTYR 492

Qy 451 EP YLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICI 494 

I I : I : : I : : | : | : : 

Db 493 APECDQPDNRPGFIKDIHYMYVATALFW ITGLIT-VIV 529 

Qy 495 SYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILV KNENIKLDELALVK-P 547

II I II I I : II:: hi I: I :::: 

Db 530 SLL TPPPTKDQI RTTTFWSKKTLVTKESCSQKDEPYKMQEKSILQCS 576 



Qy 548 RQSMTLSSTFTNKEAFLDVDSSPEGSGTED 577

I : I I I :: : I : I It 

Db 577 ENSEVI SHTI PNGKS EDSIKGLQPED 602 



RESULT 13 
OPUE_BACSU 

ID OPUE_BACSU STANDARD; PRT; 492 AA. 

AC O06493;

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Osmoregulated proline transporter (Sodium/proline symporter).

GN OPUE OR BSU06660.

OS Bacillus subtilis.

OC Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus.

OX NCBI_TaxID=1423; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168 / JH642; 

RA von Blohn C., Kempf B., Kappes R.M., Bremer E.;

RT "Osmostress response in Bacillus subtilis: characterization of a 

RT proline uptake system (OpuE) regulated by high osmolarity and the 

RT alternative transcription factor sigma B."; 

RL Mol. Microbiol. 25:175-187(1997). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168; 

RX MEDLINE=97124186; PubMed=8969499;

RA Borriss R., Porwollik S., Schroeter R.;

RT "The 52 degrees-55 degrees segment of the Bacillus subtilis 

RT chromosome: a region devoted to purine uptake and metabolism, and 

RT containing the genes cotA, gabP and guaA and the pur gene cluster 

RT within a 34960 bp nucleotide sequence."; 

RL Microbiology 142:3027-3031(1996). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168; 

RX MEDLINE=98044033; PubMed=9384377;

RA Kunst F., Ogasawara N., Moszer I., Albertini A.M., Alloni G.,

RA Azevedo V., Bertero M.G., Bessieres P., Bolotin A., Borchert S.,

RA Borriss R., Boursier L., Brans A., Braun M., Brignell S.C., Bron S

RA Brouillet S., Bruschi C.V., Caldwell B., Capuano V., Carter N.M.,

RA Choi S.K., Codani J.J., Connerton I.F., Cummings N.J., Daniel R.A.

RA Denizot F., Devine K.M., Dusterhoft A., Ehrlich S.D., Emmerson P.T

RA Entian K.D., Errington J., Fabret C., Ferrari E., Foulger D.,

RA Fritz C., Fujita M., Fujita Y., Fuma S., Galizzi A., Galleron N.,

RA Ghim S.Y., Glaser P., Goffeau A., Golightly E.J., Grandi G.,

RA Guiseppi G., Guy B.J., Haga K., Haiech J., Harwood C.R., Henaut A.

RA Hilbert H., Holsappel S., Hosono S., Hullo M.F., Itaya M., Jones L

RA Joris B., Karamata D., Kasahara Y., Klaerr-Blanchard M., Klein C.,

RA Kobayashi Y., Koetter P., Koningstein G., Krogh S., Kumano M.,

RA Kurita K., Lapidus A., Lardinois S., Lauber J., Lazarevic V.,

RA Lee S.M., Levine A., Liu H., Masuda S., Mauel C., Medigue C.,

RA Medina N., Mellado R.P., Mizuno M., Moestl D., Nakai S., Noback M.

RA Noone D., O'Reilly M., Ogawa K., Ogiwara A., Oudega B., Park S.H.,

RA Parro V., Pohl T.M., Portetelle D., Porwollik S., Prescott A.M.,

RA Presecan E., Pujic P., Purnelle B., Rapoport G., Rey M., Reynolds S.,

RA Rieger M., Rivolta C., Rocha E., Roche B., Rose M., Sadaie Y.,

RA Sato T., Scanlan E., Schleich S., Schroeter R., Scoffone F.,

RA Sekiguchi J., Sekowska A., Seror S.J., Serror P., Shin B.S., Soldo B.,

RA Sorokin A., Tacconi E., Takagi T., Takahashi H., Takemaru K.,

RA Takeuchi M., Tamakoshi A., Tanaka T., Terpstra P., Tognoni A.,

RA Tosato V., Uchiyama S., Vandenbol M., Vannier F., Vassarotti A.,

RA Viari A., Wambutt R., Wedler E., Wedler H., Weitzenegger T.,

RA Winters P., Wipat A., Yamamoto H., Yamane K., Yasumoto K., Yata K.,

RA Yoshida K., Yoshikawa H.F., Zumstein E., Yoshikawa H., Danchin A.;

RT "The complete genome sequence of the Gram-positive bacterium Bacillus 

RT subtilis."; 

RL Nature 390:249-256(1997). 

CC -!- FUNCTION: CATALYZES THE SODIUM-DEPENDENT UPTAKE OF EXTRACELLULAR

CC PROLINE.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein.

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U92466; AAB66512.1; -. 

DR EMBL; AF011545; AAB72182.1; 

DR EMBL; Z99107; CAB12486.1; 

DR PIR; H69670; H69670. 

DR SubtiList; BG12641; opuE.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Transmembrane; Sodium transport; Symport; 



KW 
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POTENTIAL. 


FT TRANSMEM 365 385 POTENTIAL.

FT TRANSMEM 394 414 POTENTIAL.

FT TRANSMEM 424 444 POTENTIAL.

FT TRANSMEM 449 469 POTENTIAL.



SQ SEQUENCE 492 AA; 53282 MW; 23459873F1E799E6 CRC64;



Query Match 9.6%; Score 285; DB 1; Length 492; 

Best Local Similarity 22.1%; Pred. No. 8.5e-13; 

Matches 118; Conservative 97; Mismatches 214; Indels 106; Gaps 18; 



Qy 5 VEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVG 64

: I : I ::::::: I I : I : I : I : : : | | | : | | s | s 

Db 3 IEIIISLGIYFIAMLLIGWYAFKKTTDIND YMLGGRGLGPFVTALSAGAADMS 55 

Qy 65 GGYINGTAEAVYVPGYGLAWAQAPI GYSLSLILGGLFFAKPMRSKGYVTMLDPFQQI 121

1:1 I = ' I I : II hi I : : I : I I : 

Db 56 GWMLMGVPGAMFATGLS T LWLALGLT I GAYSN YLLLAPRLRAYT EAADDAI T I P D FFDKR 115 

Qy 122 YGKRMGGLLFI PALMGEMFWAAAI FSAL GAT I SVI I DVDMHI SVI I SALI ATLYTLV 178 

: I : I I : : I : | : | | : : s : s I I I I 

Db 116 FQHSSSLLKIVSALIIMIFFTLYTSSGMVSGGRLFESAFGADYKLGLFLTTAVWLYTLF 175 

Qy 179 GGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVY 238 

I I : I : II I I : I I : I I : II I | : | : : 

Db 176 GGFLAVSLTDFVQGAIMFAAL-VLVPI VAFT--HVGGVAPTFHEIDAVNPH 223

Qy 239 SWLDSF LLLMLGGI PWQAYFQRVLS SS SATYAQ VLSFLA 277

III : : : : : | : | | : : | : | 

Db 224 - LLDI FKGASVI S 1 1 S YLAWGLGY YGQPHI I VRFMAIKDI KDLKPARRIG 272 

Qy 278 AFGCLVMAI PAI LI GAI GASTDWNQTAYGLP DPKTT EEADMI LP I VLQ YLC PVYI S FFGL 337 

: : : : : I I I I II : : : I I I : | I : I I 

Db 273 MSWMIITVLGSVLTGLIG VAYAHKFGVAVKDPEMI FI IFSKILFHPLITGFLL 325 

Qy 338 GAVSAAVMS S AD S S I L S AS SMFARN I YQL S FRQNAS D KE I VWVMRI T VFVFGAS ATAMAL 397 

I : I I : I I I 1:1 : I : : I : I I : I I I I I : I : I : : I I | : : | 

Db 326 SAI LAAIMS S I S SQLLVTASAVTEDLYRSFFRRKAS DKELVMI GRLS VLVIAVI AVLLS L 385 

Qy 398 LTKTVYGLWYLSSDLVYIVIF PQLLCVLFVKGTNTYGAVAGYVSG LF 444 

: I : : : I : I : | I : I I : I I : I : | : 

Db 386 NP N S T I L D LVG YAWAG FGS AFG PAI LL S L YWKRMNEWGALAAMI VGAAT VL 436 

Qy 445 LRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAK 499 

: I I I I : I : I : I : I : I : | 

Db 437 IWITTG LAKSTGVY-EIIP GFI LSMIAGI I VSMITK 471 



RESULT 14 
SL53_CANFA 

ID SL53_CANFA STANDARD; PRT; 718 AA. 

AC P31637; 

DT 01-JUL-1993 (Rel. 26, Created) 

DT 01-JUL-1993 (Rel. 26, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Sodium/myo-inositol cotransporter (Na(+)/myo-inositol cotransporter).

GN SLC5A3 OR SMIT.

OS Canis familiaris (Dog).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Carnivora; Fissipedia; Canidae; Canis. 
OX NCBI_TaxID=9615; 
RN [1] 

RP SEQUENCE FROM N.A. 
RC TISSUE=Kidney; 

RX MEDLINE=92210609; PubMed=1372904;

RA Kwon H.M., Yamauchi A., Uchida S., Preston A.S., Garcia-Perez A.,
RA Burg M.B., Handler J.S.; 



RT "Cloning of the cDNA for a Na+/myo-inositol cotransporter, a

RT hypertonicity stress protein.";

RL J. Biol. Chem. 267:6297-6301(1992).


CC -!- FUNCTION: Prevents intracellular accumulation of high

CC concentrations of myo-inositol (an osmolyte) that result in

CC impairment of cellular function.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein.

CC -!- TISSUE SPECIFICITY: Brain and kidney.

CC -!- INDUCTION: Medium hypertonicity.

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).


CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; M85068; -; NOT_ANNOTATED_CDS.

DR PIR; A42163; A42163.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Transmembrane; Sodium transport; Symport; Glycoprotein.


FT DOMAIN 1 9 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM 10 29 POTENTIAL.

FT DOMAIN 30 38 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM 39 57 POTENTIAL.

FT DOMAIN 58 86 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM 87 110 POTENTIAL.

FT DOMAIN 111 123 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM 124 144 POTENTIAL.

FT DOMAIN 145 157 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM 158 183 POTENTIAL.

FT DOMAIN 184 186 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM 187 205 POTENTIAL.

FT DOMAIN 206 303 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM 304 324 POTENTIAL.

FT DOMAIN 325 353 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM 354 376 POTENTIAL.

FT DOMAIN 377 406 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM 407 430 POTENTIAL.

FT DOMAIN 431 443 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM 444 462 POTENTIAL.

FT DOMAIN 463 510 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM 511 532 POTENTIAL.

FT DOMAIN 533 695 CYTOPLASMIC (POTENTIAL).

FT TRANSMEM 696 716 POTENTIAL.

FT CARBOHYD 32 32 N-LINKED (GLCNAC...) (POTENTIAL).

FT SITE 24 24 IMPLICATED IN SODIUM COUPLING

FT (BY SIMILARITY).

FT SITE 285 285 IMPLICATED IN SODIUM COUPLING

FT (BY SIMILARITY).



SQ SEQUENCE 718 AA; 79545 MW; 4C1B5CC4485CD268 CRC64;

Query Match 9.4%; Score 278.5; DB 1; Length 718; 

Best Local Similarity 23.1%; Pred. No. 3.7e-12; 

Matches 154; Conservative 115; Mismatches 222; Indels 177; Gaps 34; 

Qy 9 I AI I - VFYLLI LLVGI WAAWRTKNSGSAEERS EAI I VGGRDI GLLVGGFTMTATWVGGG- 66 

III: : I : I I : : I : I : I :| III I 

Db 10 IAIVALYFILVMCIGFFAMWKSNRSTVS GYFLAGRSM--TWVAIGA 53 

Qy 67 --YI NGTAEAVYVP G YGLAWAQAP I G YS LSLILGGLFFAKPMRSKGYVTM 114

: : : : : I I I : I I : : I : I I : I : I I I I I 

Db 54 SLFVSNIGSEHFI G LAG S G AAS G FAVG AW E FN AL L L L Q L L GWVF I P I Y I R S - G VY TM 109 

Qy 115 LDPFQQIYGKRMGG LLFI PALMGEMFWAAAI FSALGATI SVI IDVDMH 162 

: II II : I : I : : : I : I I : : : : 

Db 110 PEYLSKRFGGHRIQVYFAALSLILYIFTKLSVDLYSGALF IQESLGWNLY 159 

Qy 163 I SVI I SALI ATLYTLVGGLYSVAYTDWQLFCI FVG LWISV PFAL 207 

: I I I : : I I : I I I : I I I I : I : I I : I I : : | 

Db 160 VSVILLIGMTALLTVTGGLVAVIYTDTLQALLMIVGALTLMIISMMEIGGFEEVKRRYML 219 

Qy 208 SHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSWLDSFLLL MLGGIP- 253 

: I I I I I | : | | : | : : | :|| | 

Db 220 ASPNVTSILLT YN LSNTNSCNVHPKKDALKMLRNPTDEDVPWPGFVLGQTPA 271 

Qy 254 W QAYFQRVLS S S SATYAQ VLS FLAAFGCLVMAI PAI L 290 

I I MM::: : | : : | | : : : | : : 

Db 272 SWYWCADQVIVQRVLAAKNIAHAKGSTLMAGFLKLLPMFIIWPGMISRILFADDIACI 331 

Qy 291 IGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVM 345 

: I : : I I I : : I II : : : I I : I 

Db 332 NPEHCMQVCGSRAGCSNIAY PRLVMKLVPVGLRGLMMAVMIAALM 376 

Qy 346 S SADS S I LSAS SMFARNI YQLS FRQNASDKEI VWVMRITV- FVFGASATAMALLTKTVYG 404 

I II I M : : I : : I : I I : : I I : I : : I II | | : I : : : : | 
Db 377 S DLDS I FNS AST I FTLDVYKL-I RRS AS S RELMI VGRI FVAFMWI S I AWVP I I VEMQGG 435

Qy 405 LWYLS S DLVYI VI FPQL LCVLFVKGTNT YGAVAGYVSGLFLRITGGEPYLYL 456 

II I : I : I : I I I II : M : I I : I : I : I 

Db 436 QMYLYIQEVADYLTPPVAALFLLAIFWKRCNEQGAFYGGMAGFVLGA-VRLT--LAFAYR 492

Qy 457 QPLIFY PGYYPDDNGI YNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLD 512 

I M : I : : I I II : I : : I I III: 
Db 493 APECDQPDNRPGFIKDIHYMY VATALFWVTGLIT-VIVSLL TPPPTKE 539

Qy 513 VFDAWARH S EENMDKT I LVKNEN I KLDELALVKP RQSMTLS S T FTNKEAFLDVD S S PEG 572 

I : I : : : I I II :: : : I Ml : II 

Db 540 QI RTTTFWSKKSLWKESCSPKDEPYKMQEKSILRCSE NSEATNHI--IPNG 589

Qy 573 SGTEDNLQ 580 

: I I : : : 

Db 590 K-SEDSIK 596 

RESULT 15 
SL53_BOVIN



ID SL53_BOVIN STANDARD; PRT; 718 AA. 

AC P53793; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 15-JUL-1998 (Rel. 36, Last annotation update) 

DE Sodium/myo-inositol cotransporter (Na(+)/myo-inositol cotransporter).

GN SLC5A3 OR SMIT.

OS Bos taurus (Bovine).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;

OC Bovidae; Bovinae; Bos. 

OX NCBI_TaxID=9913; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Mallee J.J., Parrella T., Kwon H.M., Berry G.T.;

RL Submitted (APR-1996) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Prevents intracellular accumulation of high 

CC concentrations of myo-inositol (an osmolyte) that result in 

CC impairment of cellular function. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U41338; AAA93188.1; -. 

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Transmembrane; Sodium transport; Symport; Glycoprotein.
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POTENTIAL. 
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CYTOPLASMIC (POTENTIAL) . 
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POTENTIAL. 


FT 
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FT 
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POTENTIAL. 


FT 
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FT 
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POTENTIAL. 
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KU1 EN 1 1AL . 


r 1 


DOMAIN 


A C O 

4 bo 
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LAKbUn I u 






N-LINKED (GLCNAC...) (POTENTIAL).


FT SITE 24 24 IMPLICATED IN SODIUM COUPLING

FT (BY SIMILARITY).


FT SITE 285 285 IMPLICATED IN SODIUM COUPLING

FT (BY SIMILARITY).


SQ SEQUENCE 718 AA; 79673 MW; 206BE25FA385111D CRC64;



Query Match 9.3%; Score 275; DB 1; Length 718; 

Best Local Similarity 22.2%; Pred. No. 6.6e-12; 

Matches 148; Conservative 122; Mismatches 225; Indels 172; Gaps 32; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGG- 66 

III: : I : I I : : I : I : I : I I I I I 

Db 10 IAIVALYFILVMCIGFFAMWKSNRSTVS GYFLAGRSM--TWVAIGA 53

Qy 67 --YINGTAEAVYVPGYGLAWAQAPIGYS LSLILGGLFFAKPMRSKGYVTM 114

: : : : : I I I : I I : : I : I I : I : I I I II 

Db 54 SLFVSNIGSEHFI GLAGSGAASGFAVGAWEFNALLLLQLLGWVFI PIYIRS-GVYTM 109 

Qy 115 LDPFQQI YGKRMGG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMH 162 

: I I I I : I : I : : : I : I I : : : : 

Db 110 PEYLSKRFGGHRIQVYFAALSLILYIFTKLSVDLYSGALF IQESMGWNLY 159 

Qy 163 I SVI I SALIATLYTLVGGLYSVAYTDWQLFCI FVG LWISV PFAL 207 

: I I I : : I I : I I I : I I I I : I : I I : I I : : I 

Db 160 VSVI LLI GMTALLTVTGGLVAVI YTDTLQALLMI VGALTLMVI SMMEI GGFEEVKRRYML 219 

Qy 208 SHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSWLDSFLLL MLGGIP- 253 

: I I I I I | : | | : | : : | : I I I 

Db 220 ASPNVTSILLT YN LSNTNSCNVHPKKDALKMLRNPTDEDVPWPGFILGQTPA 271

Qy 254 W QAYFQRVLSSSSATYAQ VLS FLAAFGCLVMAI PAI L 290 

I I MM::: : | : : | | : : : | : : 

Db 272 SWYWCADQVIVQRVLAAKNIAHAKGSTLMAGFLKLLPMFIIWPGMISRILFADDIACI 331

Qy 291 IGAI GASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVM 345 

: I : : I I I : : I II : : : II : I 

Db 332 NPEHCMQVCGSRAGCSNIAY PRLVMKLVPVGLRGLMMAVMIAALM 376 

Qy 346 S SADS S I LSAS SMFARNI YQLS FRQNASDKEI VWVMRITV- FVFGASATAMALLTKTVYG 404 

I II I I I : : I : : I : I I : : I I : I : : I I I I I : i : : : : | 
Db 377 SDLDSIFNSASTIFTLDVYKL-IRKSASSRELMIVGRIFVAFMWISIAWVPIIVEMQGG 435 

Qy 405 LWYLSSDLVYIVIFPQL LCVLFVKGTNTYGAVAGYVSGLFL RITGGEPYLYLQ 457 

II I : I : I : I I I II I :: I I Ml : I 

Db 436 QMYLYIQEVADYLT P PVAALFLLAI FWKRCNEQGAFYGGMAGFI LVWRLT--LAFAYRA 493

Qy 458 PLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAV 517 

I ||:::: : : | : : I : | : : | III: 
Db 494 P ECDQPDNRPVFIKDIHYMYVATALFWITGL-ITVIVSLL TPPPTKEQI 541 



Qy 518 VARHSEENMDKTILV KNENIKLDELALVK-PRQSMTLSSTFTNKEAFLDVDSSP 570



i • i • ■ • i i*i f ■ i . • • • i . . i . . : i 

Db 542 --RTTTFWSKKSLWKESCSPKDEPYKMQEKSILRCSENSEVINHVI PNGKS-- - --EDS I 595

Qy 571 EGSGTED 577 

: I II 
Db 596 KGLQPED 602 

Search completed: September 28, 2004, 17:08:30 
Job time : 27 secs
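
The per-hit summary figures in this listing appear to be related by simple ratios. The sketch below is an inference from the printed numbers, not a documented description of the search program: it assumes that "Best Local Similarity" is Matches divided by the total of Matches, Conservative, Mismatches and Indels, and that "Query Match" is the hit Score divided by the query's self-match score (2972, taken from the 100% hit in RESULT 1). The function and variable names are hypothetical.

```python
# Assumed relationships, inferred from the numbers printed in this report:
#   Best Local Similarity = Matches / (Matches + Conservative + Mismatches + Indels)
#   Query Match           = Score / self-match score of the query
SELF_SCORE = 2972  # Score of the 100.0% Query Match hit (RESULT 1)


def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent identical columns over all aligned columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


def query_match(score, self_score=SELF_SCORE):
    """Hit score as a percentage of the query's self-match score (assumed definition)."""
    return 100.0 * score / self_score


# (score, matches, conservative, mismatches, indels) as printed for RESULTS 11-15
hits = {
    "SL54_MOUSE": (290, 147, 102, 252, 158),
    "SL53_MOUSE": (289, 150, 113, 209, 218),
    "OPUE_BACSU": (285, 118, 97, 214, 106),
    "SL53_CANFA": (278.5, 154, 115, 222, 177),
    "SL53_BOVIN": (275, 148, 122, 225, 172),
}

for name, (score, m, c, mm, ind) in hits.items():
    print(f"{name}: Query Match {query_match(score):.1f}%; "
          f"Best Local Similarity {best_local_similarity(m, c, mm, ind):.1f}%")
```

Under these assumptions the recomputed percentages can be compared against the "Query Match" and "Best Local Similarity" lines printed for each result, which gives a quick consistency check on the OCR'd counts.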



